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1.  INTRODUCTION  AND  BACKGROUND 


1.1  Introduction 

Continued  increase  in  the  complexity  and  use  of  computer 
systems  in  a  wide  variety  of  applications/  especially  during  the 
last  decade,  has  necessitated  a  greater  emphasis  on  the  development 
of  cost-effective  and  reliable  software.  The  importance  of  soft¬ 
ware  has  been  further  enhanced  by  the  fact  that  the  ratio  of  soft¬ 
ware  to  hardware  costs  continues  to  grow  as  technological  advances 
keep  reducing  the  hardware  gost.  This  has  led  to  the  evolution  of 
a  new  discipline  called  software  engineering  (Reference  6).  This 
discipline  is  still  in  its  infancy  and  has  been  described  as  the 
practical  application  of  scientific  knowledge  to  produce  software 
in  a  way  that  is  cost-effective  and  reliable. 

The  performance  of  a  software  system  is  dependent  on  the 
tools  and  techniques  used  during  its  development  and  operation. 

An  important  performance  criterion  is  the  nature  and  frequency  of 
software  failures.  A  failure  is  said  to  occur  when  a  fault,  a 
specific  manifestation  of  an  error,  in  the  program  is  evoked  by 
some  input  data  resulting  in  the  computer  program  not  correctly 
computing  the  required  funfction.  A  software  error  is  a  conceptual, 
syntactical,  or  clerical  discrepancy  which  results  in  one  or  more 
faults  in  the  software.  It  should  be  noted  that  these  definitions 
are  controversial  and  not  uniformly  accepted.  To  be  consistent 
with  the  existing  literature,  in  this  report  the  terms  error  and 
failure  are  used  interchangeably,  except  where  indicated  otherwise. 


Several  empirical  studies  of  software  failure  phenomena  have 
been  undertaken  in  recent  years  with  the  objective  of  improving 
software  performance.  Such  studies  can  be  classified  into  one  (or 
both)  of  two  categories.  In  the  first  category  the  emphasis  is 
on  the  analysis  of  software  error  data  collected  from  small  or 
large  projects  (References  17,  22,  43,  56,  67,  70),  during  develop¬ 
ment  and/or  operational  phases.  Studies  in  the  second  category 
are  primarily  aimed  at  the  development  of  analytical  models 
(References  7,  14,  24,  25,  26,  28,  31,  36,  46,  58,  60,  61,  66,  69, 
71). 

A  number  of  software  reliability  models  have  been  proposed 
and  investigated  during  the  last  seven  years  to  describe  the 
stochastic  behavior  of  the  software  failure  process  and  to  esti¬ 
mate  the  number  of  software  errors  remaining  in  the  system.  We 
classify  these  models  into  two  major  categories.  The  first  one 
emphasizes  the  stochastic  nature  of  software  failures,  while  the 
second  approach  uses  combinatorial  analysis  to  provide  measures 
of  software  reliability. 

The  basis  of  the  first  approach  is  the  reliability  theory 
developed  for  hardware  systems.  Since  the  error  detection  rate 
changes  during  the  software  development  cycle,  the  models  have 
been  modified  to  incorporate  this  feature  of  the  software  failure 
phenomenon.  The  time  between  failures  is  usually  assumed  to  be 
exponential  with  a  parameter  that  changes  with  the  number  of  remain¬ 
ing  errors  in  this  class  of  models  (References  26,  31,  46,  58,  61). 
Some  work  has  been  done  using  a  Bayesian  approach  (References  23, 

36)  in  which  the  time  to  next  failure  is  taken  to  be  dependent  on 


the  previous  failure  history.  Software  reliability  has  also  been 
modelled  based  on  the  number  of  errors  found  during  the  debugging 
phase  (Reference  34).  Also  of  interest  in  this  category  are  the 
models  for  software  availability  (References  49,  50,  68). 

As  mentioned  above,  in  the  second  approach  software  reli¬ 
ability  is  measured  using  combinatorial  analysis  which  includes 
capture-recapture  sampling  (or  error  seeding)  models  and  input 
data  domain  models.  The  objective  of  capture-recapture  sampling 
is  to  determine  the  number  of  remaining  errors  in  a  computer  pro¬ 
gram  by  introducing  (or  seeding)  errors  and  then  using  classical 
statistical  techniques  for  estimation,  computer  systems  are 
initialized  and  exercised  by  a  controlled  input  data  set  and  the 
reliability  of  the  program  is  estimated  by  running  the  program  for 
all  such  possible  input  data  sets.  This  is  the  input  data  domain 
approach  which  has  serious  limitations  from  a  practical  viewpoint. 

A  classification  of  the  above  models  is  shown  in  Figure  1.1. 
Each  of  these  categories  is  described  in  more  detail  in  Appendix  A. 
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1.2  Purpose  and  Outline  of  the  Report 


In  this  report  our  objective  is  to  develop  a  parsimonious 
model  whose  parameters  have  a  physical  interpretation,  and  which 
can  be  used  to  predict  various  quantitative  measures  for  software 
performance  assessment.  Also  of  interest  is  the  applicability  of 
the  model  over  a  broad  class  of  projects.  Further,  it  should  be 
possible  to  estimate  the  parameters  of  the  model  from  available 
failure  data  which  could  be  given  as  either  the  number  of  failures 
in  specified  time  intervals  or  as  times  between  software  failures. 

With  this  objective,  we  develop  and  investigate  a  non- 
homogeneous  Poisson  process  (NHPP)  (Reference  9)  model  with  a  time 
dependent  error  detection  rate  for  the  software  failure  phenomenon. 
By  studying  the  behavior  of  the  counting  process,  (N(t)  ,  t  >0],  the 
number  of  failures  by  time  t,  it  is  shown  in  Section  2  that  N(t) 
can  be  well  described  by  a  non-homogeneous  Poisson  process  (NHPP) 
with  a  two  parameter  exponentially  decaying  error  detection  rate. 

NHPP  has  been  used  by  many  researchers  to  describe  random 
phenomena  in  various  applications  (References  13,  15,  16).  Some 
such  applications  are  the  occurrences  of  coal  mining  disasters 
(Reference  39);  equipment  failures  (References  16,  32,  51);  trans¬ 
actions  in  a  data-base  system  (Reference  33),  and  software  error 
counts  over  a  series  of  time  intervals  (Reference  60).  Various 
forms  of  the  intensity  function  for  the  NHPP  used  in  actual  appli¬ 
cations  are  the  exponential  polynomial  rate  function  (Reference  33), 
a  log-linear  rate  function  (Reference  11)  and  a  Weibull  rate  func¬ 
tion  (References  13,  15,  44). 


* 
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Several  measures  for  software  performance  assessment,  such 
as  the  number  of  errors  remaining  in  the  system,  distribution  of 


time  to  next  failure,  and  software  reliability  are  proposed  in 
Section  3.  Based  on  the  NHPP  model,  expressions  are  then  derived 
for  obtaining  the  estimates  and  confidence  limits  for  these  per¬ 
formance  measures. 

Two  methods  are  described  in  Section  4  for  estimating  the 
parameters  of  the  model  from  available  failure  data.  The  first 
one  is  for  the  case  when  data  is  given  in  the  form  of  number  of 
failures  in  given  time  intervals.  The  time  intervals  can  be  of 
equal  or  unequal  lengths.  The  second  method  is  used  when  times 
between  software  failures  are  given. 

In  Section  5,  a  method  for  testing  the  goodness-of~f it  is 
developed  based  on  the  Kolmogorov-Smirnov  test.  Expressions  ‘for 
performing  this  test  are  also  derived. 

A  general  methodology  for  analyzing  software  failure  data  is 
presented  in  Section  6.  It  gives  a  step  by  step  procedure  of  analysis 
starting  from  raw  data  to  the  computation  of  useable  performance 
measures.  This  methodology  is  employed  in  Sections  7  and  8  to 
give  a  detailed  analysis  of  failure  data  from  two  software  systems. 

The  first  one  (Section  7)  is  a  large  command  and  control  system 
while  the  second  (Section  8)  is  a  relatively  small  Naval  Tactical 
Data  System  (NTDS).  A  comparison  of  NTDS  data  analysis  using  the 
NHPP  and  the  De-Eutrophication  process  models  is  presented  in 
Section  9.  Some  concluding  remarks,  limitations  and  advantages  of 
the  NHPP  model  are  summarized  in  Section  10. 
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2.  MODEL  DEVELOPMENT 


A  software  system  in  use  is  subject  to  failures  caused  by 
errors  present  in  the  system.  The  errors  are  encountered  when  a 
sequence  of  instructions  is  executed  which,  in  turn,  defends  on  the 
input  data  set.  In  this  section  we  develop  a  model  to  describe 
this  failure  occurrence  phenomenon. 

2.1  Deterministic  Analysis  of  Software  Failure  Process 

It  is  useful  to  first  make  a  simpler  analysis  by  ignoring 
the  statistical  fluctuations  in  the  number  of  software  failures 
before  analyzing  the  failure  phenomenon  as  a  stochastic  process 
(Reference  12).  Let  n(t)  denote  the  cumulative  number  of  software 
failures  detected  by  time  t.  Assume  that  n(t)  is  large  enough 
so  that  it  can  be  expressed  as  a  continuous  function  of  t  .  Since 
the  number  of  errors  in  a  system  is  a  finite  value,  n(t)  is  a 
bounded  non- decreasing  function  of  t  with 

n(0)  =  0  and  n(“)  =  a  .  (2.1) 

For  purposes  of  modeling  we  assume  that  the  usage  of  the  system  is 
basically  similar  over  time.  Then  the  number  of  failures  in 
(t,t+&t)  is  proportional  to  the  number  of  undetected  errors  at  t, 
i.e.  , 

n (t+ At)  -  n (t )  =  b( a-n (t) } Lt  ,  (2.2) 

where  b  is  a  proportionality  constant. 

A  graphical  representation  of  the  above  description  is  pro¬ 
vided  in  Figure  2.1. 
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FIGURE  2.1  A  GRAPHICAL  REPRESENTATION  OF 

THE  DETERMINISTIC  MODEL  FOR  SOFT 
WARE  FAILURES 
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Now,  from  Equation  (2.2)  we  get  the  differential  equation 

n 1 ( t )  =  ab  -  bn ’ ( t )  .  (2.3) 

Taking  the  Laplace  transform  (References  1,  10)  of  Equation  (2.3) 
under  the  conditions  of  Equation  (2.1),  we  have 

sn (s )  =  ab  -  bn (s )  , 

or  n(s)  =  ,  (2.4) 

where 


n(s)  =  [  e‘st  •  dn ( t)  .  (2.5) 

J0 

The  solution  of  Equation  (2.3)  is  thus  obtained  by  inverting 
Equation  (2.4)  and  is  given  by 

n(t)  =  a(l  -  e”bt)  .  (2.6) 

Under  the  assumptions  discussed  above,  Equation  (2.6)  is  the 
deterministic  model  of  the  software  failure  process.  For  given  a 
and  b ,  we  can  easily  compute  the  number  of  failures  to  be  encountered 
by  some  time  t  so  that  the  failure  phenomenon  can  be  described  with 
certainty.  It  should  be  noted,  however,  that  the  actual  failure 
phenomenon  is  not  deterministic. 
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2.2  Stochastic  Analysis  of  Software  Failure  Process 

In  an  actual  usage  the  software  system  is  subjected  to  random 
inputs  causing  the  failures  to  occur  at  random  times,  i.e.,  the 
failure  phenomenon  is  stochastic  (non-deterministic ) .  Therefore, 
a  realistic  description  of  the  failure  process  must  incorporate 
this  randomness. 

Let  (N (t)  ,  t>0)  be  a  counting  process  (References  52,  54,  62) 
representing  the  cumulative  number  of  failures  by  time  t  .  (Note  that 
N(t)  is  a  random  variable  while  n(t)  above  was  taken  to  be  deterministic.' 
Assuming  that  each  failure  is  caused  by  one  error,  N(t)  also  repre¬ 
sents  the  cumulative  number  of  errors  detected  by  time  t  .  It  should 
be  pointed  out  that  a  detected  error  may  not  be  removed  and  as  a 
result  may  cause  additional  failure (s)  at  a  later  stage.  For  the 
N(t)  process,  such  recurrences  are  counted  as  new  events. 

Let  m(t)  be  the  mean  value  function  of  the  N(t)  process, 

i.e., 

m(t)  s  E [N (t ) ]  .  (2.7) 

Since  m(t)  represents  the  expected  number  of  software  failures 
or  detected  errors  by  time  t  ,  it  is  a  non-decreasing  function  of 
t  .  If  we  assume  that  there  will  be  a  finite  number  of  errors  to 
be  detected  in  an  infinite  amount  of  time,  m(t)  has  the  following 
boundary  conditions : 

0  ,  t  =  0 

m(t)  =  (2.8) 

a  ,  t  =  » 

where  a  <  ®  and  represents  the  expected  number  of  software  errors 


1C 


to  be  eventually  detected.  Furthermore,  it  is  assumed  that  for 
small  At  the  expected  number  of  software  failures  during  (t,t+At) 
is  proportional  to  the  expected  number  of  software  errors  undetected 
by  time  t  ,  i. e. , 


m(t+At)-m(t)  =b(a-m(t))At 


(2.9) 


where  b  is  a  constant  of  proportionality.  Solving  the  differential 
equation  obtained  from  Equation  (2.9)  under  the  boundary  conditions 
of  Equation  (2.8),  we  get 


m(t)  =  a(l  -  e“bt)  . 


(2.10) 


This  equation  specifies  the  mean  value  function  for  the  underlying 
software  failure  counting  process  N(t)  .  The  intensity  function, 
obtained  by  taking  the  derivative  of  m(t)  ,  represents  the  error 
detection  rate  at  time  t  and  is  given  by 


X ( t )  =  m ' ( t )  =  abe 


(2.11) 


We  now  study  the  probabilistic  behavior  of  the  N(t)  process 
by  using  m(t)  and  X(t)  .  Since  there  are  no  failures  at  t=0  , 
we  have  N(0)  =  0.  It  is  also  reasonable  to  assume  that  the  number 
of  software  failures  during  non-overlapping  time  intervals  are 
independent.  In  other  words,  for  any  finite  collection  of  times 
bl  <  b2  <  *  *  *  <  fcn'  tbe  n  ran^otn  variables  N(t^)  ,  (NftjJ-Nft^)  ) , .  .  . , 
(N(tn)-N(tn  ^) J  are  statistically  independent.  This  implies  that 
the  counting  process  (N(t)  ,  t>0]  has  independent  increments. 
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We  assign  the  probabilities  on  the  increments  of  the  N(t) 
process  as  follows. 


N(t+At)  -  N(t) 


with  probability 
with  probability 
with  probability 


1-X (t) At+o(At) 
X  ( t )  At+o  ( At) 
o(At) 


(2.12) 


where 

°  ^  -*  0  as  At  -*  0  . 

The  underlying  N(t)  process  satisfying  conditions  of  Equation  (2_12) 
is  now  a  NHPP  with  mean  value  function  m(t)  and  intensity  function 
X(t)  as  given  in  Equations  (2.10)  and  (2.11),  respectively  (References 
18,  19).  Hence  the  distribution  of  N(t)  is  given  by 

P(N (t)=yj  =  e~m(t)  ,  y*  0,1,2,...  .  (2.13) 

Under  the  assumptions  discussed  above,  the  stochastic  behavior  of 
the  software  failure  phenomenon  can  be  completely  described  by 
Equation  (2.13).  It  should  be  pointed  out  that  Equation  (2.9) 
implies  that  the  ratio 


Number  of  errors  detected  during  (t,t+At) 
(Number  of  errors  undetected  by  t)At 


(2.14) 
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is  constant  at  any  time  t  .  Therefore,  b  can  be  interpreted  as 
the  error  detection  rate  per  error. 

Equations  (2.10)  and  (2.13)  constitute  the  basic  software 
failure  model  under  study  in  this  report. 


3.  MODEI£  FOR  SOFTWARE  PERFORMANCE  ASSESSMENT 


The  model  developed  in  Section  2  is  a  description  of  the 
failure  phenomenon.  In  order  to  use  this  model  to  predict  soft¬ 
ware  performance,  we  generally  need  expressions  for  quantitative 
measures  such  as  the  number  of  failures  by  some  prespecified  time, 
the  number  of  errors  remaining  in  the  software  at  a  future  time, 
and  software  reliability  during  a  mission.  In  this  section  we 
develop  models  that  can  be  employed  to  estimate  such  quantities. 


3.1  Number  of  Software  Errors  Detected  by  t 

For  given  a  and  b  the  distribution  of  N(t)  ,  the  cumula¬ 
tive  number  of  software  failures  detected  by-time  t  ,  is  obtained 
from  Equations  (2.10)  and  (2.13)  as 


P(N ( t ) -y ]  =  e~ad-e  ,  y  =  0,l,2,...  (3.1) 


-bt , 


In  other  words,  N(t)  has  a  Poisson  distribution  with  mean 


m (t)  —  E [N (t ) ]  =  a(l-e”bt) 


(3.2) 


Note  that 

ay  -a 

P(N(®)=y)  =  e  3  ,  y  =  0,1,2, -  (3.3) 

I.e.,  the  distribution  of  N(»)  ,  the  total  number  of  failures 
encountered  or  errors  detected  if  the  system  is  used  indefinitely, 
is  also  a  Poisson  distribution  with  mean  'a'.  This  result  is  con¬ 
sistent  with  theoretical  studies  which  indicate  that  the  Poisson 
process  is  the  limiting  distribution  of  many  phenomena  similar  to 
the  software  error  occurrence  phenomenon  (References  41,  62). 


3.2  Number  of  Remaining  Errors  and  Related  Results 

We  have  been  considering  the  number  of  failures  encountered 
by  time  t ,  N(t)  .  Since  many  of  the  performance  measures  depend 
on  the  number  of  errors  remaining  in  the  system,  we  now  consider 
this  phenomenon. 

Let  N  { t )  be  the  number  of  errors  remaining  in  the  system 
at  time  t ,  i. e. , 

N(t)  =  N{“)  -N(t)  ,  (3.4) 

The  expectation  of  N(t)  is  given  by 

E  [N  ( t )  ]  =  ae“bt  .  (3.5) 

The  conditional  distribution  of  N(t)  ,  given  N(t)=y,  is 
obtained  as  follows; 


p(N(t)=xlN(t)=yj  =  P(N(®)  =  y+x) 


ay+x 

(y+x): 


-a 

e 


x  -  0, 1, 2, . .  .  . 


(3.6) 


This  conditional  distribution  is  important  for  deciding  whether 
the  software  system  under  development  can  be  released  or  not.  The 
decision  should  be  made  based  on  the  number  of  errors  remaining 
in  the  software  because  this  quantity  plays  an  important  role  in 
software  reliability  assessment.  Suppose  that  the  decision-maker 
conducts  an  experiment  and  finds  y  software  errors  by  time  t  . 
Then,  a  decision  might  be  to 


Accept  if  N(t)  <  nQ 

and 

Reject  if  N ( t )  >  n^ 
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where  nQ  is  some  specified  number.  For  this  decision  rule,  the 
probability  that  the  software  system  is  accepted  for  a  given  number 
of  failures  y  by  time  t  is 

P( Accept J  =  P(N ( t)  < nQ IN (t)  =  yj 
and,  using  Equation  (3.6),  becomes 

_  y+  i 

P { Accept )  =  2,  '(y+i):'  e~a  •  (3*7) 

i=0 

The  conditional  expectation  of  N(t)  ,  given  N(t)  =y  ,  is 
give  \  by 

E  [N  ( t )  IN  (t )  =y]  =  E[N(°°)“N(t)  lN(t)=y] 

»  E[N(»)~y) 

or  E[N(t) lN(t)=y]  =  a  -  y  .  (3.8) 

Therefore,  the  expected  number  of  errors  remaining  in  the  software 
system  at  time  t ,  given  that  y  errors  have  been  detected  during 
the  testing  period  t  ,  is  simply  the  expected  number  of  failures 
to  be  encountered  during  [0,®)  less  the  number  of  errors  detected 
during  the  period  [0,t]. 

As  we  can  see,  the  parameter  ’a'  plays  a  crucial  role  in  this 

study. 
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3.3  Software  Reliability 


Let  a  sequence  of  random  variables  [x^  ,  i  =  l,2,...J  denote 
a  sequence  of  times  between  software  failures  associated  with  the 
N (t)  process.  Then  denotes  the  time  between  the  (i-l)st  and 
the  ith  failures.  We  also  define 

n 

sn  H  Z  Xi  '  n=l,2,...  (3.9) 

i=l 

which  represents  the  time  to  the  nth  failure.  Let  ^(x)  be  the 
Cumulative  Distribution  Function  (cdf)  of  x^  ,  i.e.  , 


*v  (x)  =  P(X.  <x)  .  (3.10) 

X1  A 

Note  that  the  event  {x^>xj  implies  that  there  are  no  failures 
during  (0,x],  i.e.,  the  event  {N(x)-0J.  Then  using  Equation  (3.1)  the 
reliability  function  associated  with  the  first  failure  time  is 
given  by 


or 


R„  (x) 

X1 


P(X1  >x)  =  P(N(X)=0  : 


-a(l-e"bx) 


(3.11) 


Now,  the  cdf  of  X-^  can  be  written  as 

#x  (*>  "  1  ”  (*) 

. 1  -bx . 

or  fv  (x)  =  1  -  e"*a  1-8  ’  .  (3.12) 

X1 

The  Probability  Density  Function  (pdf)  is  defined  as 
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so  that 


/ 


%'*>  -  s  *Xl(x) 


*“1dx 

cp  (x)  -  abe~bx  e"a(1“e  }  .  (3.13) 

X1 


Next  consider  the  conditional  probability  distribution,  i  .  (x!s)  , 

of  {X2lX^j.  The  event  (X2>xlx^  =  sj  implies  (no  failures  in  (s,s+x]j. 
Then  the  conditional  reliability  function  of  the  second  failure, 
given  that  the  first  failure  occurs  at  time  s  ,  is  given  by 
R^|x  (xls)  =  P(X2>xlX1-s] 

=  P{no  failures  in  (s,s+x]j 

=  P{N (s+x )-N(s )=0 } 

_  e-  (m(s+x)-m(s)] 


e-a(e“bs-e"b(s+x)] 


(3. 14) 


From  Equation  (3.14),  we  obtain 


Jx2ix1(x,s)  ~  1 "  Px2ix1(xls) 


.  -bs  -b(s+x), 
=  l-e_a[e  'e  ] 


(3.15) 


and 


tp. 


,x.ix,(xis)  f  *  VlX,(XlS) 


2  1 


2  1 


,3.16, 


Combining  Equations  (3.12)  and  (3.16),  we  get  the  joint  density 
of  X^  and  X^  as 
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r 


<p. 


(X,  , X- )  = 


XrX2  v  i*  2 


»X,IX  <X2|X1>,,,X,  (xl> 


x2  '~1 
-bx. 


-b(x.+x9)) 

=  (abe  ■L)  (abe  'L 


-bx. 


,,  ‘X.  ,  'bXl  •b(xl+x2>, 

-a ( 1-e  )  -af e  -  e  J 

x  e  '  'e  1 


2  2  “bxl  ~b(x1+x2)  -af l-e"b(Xl+X2) 1 
°r  , x  (X1'X2}  =  a  b  e  e  e  ali  6  J  .  (3.17) 

1  2 


Making  the  transformation  s^x^  /  s2=xi+x2  *  tbe  3°lnt  density  of 


and  S2  is 


-bs. 


t  \  *2,2  ~bSl  ”bS2  -a  (1-e  2) 

CPS1,S2(S1'S2)  =  a  b  e  e  e 


(3. 18) 


In  general,  it  can  be  shown  that  the  conditional  reliability 
function  of  X^  ,  given  ^  -  s  ,  is  given  by 


R  .  (xls)  =  e 
*klbk-l 


-a{e“bs>e"b(s+x)) 


(3.19) 
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3.4  Conditional  Distribution  of  X^IS^  ^ 

The  conditional  cdf  and  pdf  are  obtained  from  Equation  (3.19) 
by  recalling  that  R(x)  =l-$(x)  and  cp(x)=^$(x)  .  Thus,  we  have 


*k 


IS 


k-1 


-bs  -b ( s+x ) , 
(xls)  .  1  -  e-a^e  -e  ) 


(3.20) 


and 


<P 


V 


(xls) 


k-1 


=  *»-*<-*  ).-.t.-b,-e-b(wx)Ji 


(3.21) 


respectively. 

As  can  be  seen  from  the  above  equations,  the  time  to  the  next 
failure  depends  on  the  time  when  the  last  failure  occurs.  It  should 
be  noted  that  the  distributions  of  times  between  failures  are 
improper,  i.e.. 


k=l 


(®l  s) 


1 


<  1  . 


(3.22) 


This  is  due  to  the  fact  that  the  event  (no  failures  in  (0,®] }  is 
allowed  in  our  model.  Hence,  the  expectations  of  these  quantities 
do  not  exist.  This  type  of  behavior  does  not  cause  any  theoretical 
problems  in  analysis. 
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3.5  Joint  Density  of  waiting  Times 


As  defined  above,  (x^  »  k =  1, 2 , . . . }  denotes  the  sequence  of 
times  between  software  failures.  Then 

n 

S  =  2  X .  ,  n  —  1,2,... 

n  i=l  1 

is  called  the  waiting  time  to  the  nth  software  failure.  This  quantity 

is  quite  important  for  estimation  of  parameters  a  and  b  and,  hence, 

we  obtain  the  distribution  of  {S.,S-,...,S  ).  The  distribution 

x  z  n 

is  obtained  by  using  an  approach  similar  to  that  used  for  getting 
Equation  (3.17).  The  result  is  summarized  in  the  following  theorem. 


Theorem.  The  joint  probability  density  of  S^,S2,...,Sn  is  given 
by 


,S 


(s. 


n 


,sn)  =  (ab) 


n 


n 

-b  £ 

.  i=l 


-bs 


-a(l-e 


n, 


(3.23) 


where  s.,s_,...,s  denote  the  realizations  of  SwS~,...,S  , 

i  i  n  x  i  n 

respectively. 

The  density  can  also  be  written  as 


<P 


S 


1 


9 


sn> 


-m(s  )  n 
e  n  n  Ms.) 
k=l  K 


(3.24) 


J  '  V 

where  X  (sk)  =  {m(sk) }  and  m(sk)=a(l-e  K). 
k 

this  theorem,  see  References  11  and  15. 


For  a  proof  of 


Equation  (3.23)  will  be  used  in  Section  4  to  estimate  a 
and  b  based  on  observed  data  £  =  (s^,...,s  )  . 


2.1 


3.6  Joint  Counting  Probability 

The  property  of  independent  increments,  along  with  Equations 

(2.8)  and  (2.12)  of  Section  2,  provides  a  complete  statistical 

characterization  for  NHPP  so  that  the  joint  counting  probability 

can  be  determined  for  any  collection  of  times  0  <  t.  <  t-  <  . . .  <  t  . 

1  2.  n 

That  is,  with  t^  =  0  ,  y^  =  0  , 


P{N(t1)  =  yrN(t2)  =  y2,  ...,  N(tn)  =  yn) 


n 


pfN(t.)-N(ti_1)=yi-yi_1} 


yi“yi-l 

n  fm(t.)  -m(ti_1)]  1  -m(t  ) 

n  -  e 


i=1 


(3.25) 


Equation  (3.25)  is  needed  for  estimating  the  parameters  a  and  b 
for  given  data  ((yjL,ti)  ,  i  =  1,  2, . . . ,  n}  ,  as  will  be  seen  in 
Section  4. 
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4.  ESTIMATION  OF  MODEL  PARAMETERS  FROM  FAILURE  DATA 

The  basic  models  for  the  failure  process  and  performance 
measures  wore  developed  in  Sections  2  and  3,  respectively.  In 
order  to  use  these  models  for  software  performance  assessment,  the 
only  parameters  to  be  specified  are  the  total  expected  number  of 
errors  to  be  detected,  a  ,  and  the  error  detection  rate  per  error, 
b  .  In  other  words,  for  given  a  and  b ,  various  useful  quantities 
can  be  computed  from  the  relevant  equations  in  Sections  2  and  3. 

In  general,  a  and  b  are  not  known  for  a  specific  software 
system  and  are  estimated  from  the  available  data  generated  during 
testing.  However,  that  is  not  the  only  way  to  get  a  and  b .  One 
may  be  willing  to  extrapolate  these  values  based  on  the  data  from 
one  or  more  "similar"  systems.  Another  method  would  be  to  use  a 
Bayesian  approach,  whereby  knowledge  about  a  and  b  can  be  expressed 
as  prior  distributions  and  used  for  performance  assessment.  This 
approach  can  also  be  used  in  conjunction  with  available  data  and 
is  specially  useful  when  failure  data  are  scarce  or  expensive  to 
collect.  Formal  analytical  expressions  for  the  Bayesian  approach 
are  currently  under  development  and  will  be  reported  in  a  separate 
technical  report. 

The  purpose  of  this  section  is  to  describe  methods  for 
estimating  a  and  b  from  failure  data.  Use  of  these  methods  is 
discussed  in  Sections  7  and  8  via  failure  data  from  operational 
systems.  Such  data  are  generally  available  as 

(i)  total  number  of  failures  (errors)  in  given  time  intervals 
and/or  as 

(ii)  times  between  failures. 


Most  of  the  available  data  is  given  in  the  form  of  number 
of  failures  in  given  time  intervals.  The  data  on  time  between 
failures  is  very  rare.  Both  of  these  cases  are  considered  below. 
Expressions  are  also  derived  for  calculating  the  joint  confidence 
regions  for  a  and  b . 


4.1  Estimation  When  Cumulative  Failures  are  Given 


We  first  consider  the  case  when  data  are  available  as  cumula- 

/ 

tive  number  of  failures  in  given  time  intervals.  Suppose 
^l'^2'‘**'yn  are  t*ie  cumulative  number  of  failures  detected  by  times 
tl* fc2' ' ‘ * ' fcn  '  respectively.  This  can  also  be  written  as  data 
pairs  [ (y^»t^)  ,  i = 1,2, . . . ,n} .  Thus  the  number  of  failures  in 
time  interval  (t^_^,t^)  is  (y^-y^_^)  for  i =  1, 2, 3, . . . , r ,  where 
tg  =  0  and  yQ  =  0  .  We  will  obtain  the  Maximum  Likelihood  Estimates 
a  and  b  of  a  and  b  ,  respectively.  To  do  this,  we  first  write 
the  joint  density  and  obtain  the  likelihood  function,  and  then  the 
log -likelihood  function.  Next,  we  take  the  partial  derivatives  of 
the  log -likelihood  function  with  respect  to  a  and  b  and  equate 
them  to  zero  for  maximization.  The  solutions  of  the  resulting  two 
equations  are  the  desired  values  (a,b). 

Now,  to  get  the  joint  density,  we  note  that  in  our  notations 
y1#y2,...,yn  are  the  observed  values  of  N(t^) ,N(t2) ,. . . ,N(t  )  , 
respectively.  Hence  from  Equation  (3.25), 

P(N(tx)  =y1  ,  N(t2)  =y2 . N(tn)  =  yj 

n  [m (t . )-m (t .  .)]  1  -{m(t.)-m(t  )] 

=  IT  - - - — - e  1  x~l  (4.1) 

1=1  (yryi-i’1 

-bt . 

where  m(t^)  =  a(l-e  1)  . 


It  is  well  known  that  the  likelihood  function  for  the  para- 
meters  is  simply  the  ]omt  density  of  y 2> ' • ’  n  >  but  now  these 
values  are  known  constants.  Substituting  for  m(t^)  in  Equation  (4.1), 
the  likelihood  function  for  (a,b)  given  the  data  (%,t)  is 


n 


L(a,b  \y_,  t)  =  IT 
i=l 


(a(e 


-bt 


i-1 


-bt . 


-  e 


±n 


yryi-i 


-bt 


-a ( 1-e 


n, 


(4.2) 


Taking  the  natural  logarithm  of  Equation  (4.2)  yields: 

n  n  -bt .  .  -bt . 

XnL(a,b  l£,  t)  =  T.  (y.-y.  ,)j&na+  T.  (y.-y.  Ointe  1-  -e  x) 
i=l  1  1  x  i=l  1  X“-L 

n  -bt 

“  ±1Xn(yi"yi-l):  “  a(1"e  }*  (4-3) 


As  mentioned  above,  the  maximum  likelihood  estimates  (mle's)  are 
those  values  of  a  and  b  which  maximize  £nL(a,b  1^,  t)  ,  i.e.,  which 
satisfy  (for  brevity  we  write  L  to  denote  L(a,bl^,J:)) 

ainL  _  Q 

Sa 

and  .  (4.4) 

d£nL  n 
6b 


By  taking  the  partial  derivatives  of  Equation  (4.3 )  and  equating 
them  to  zero,  we  obtain  after  some  simplification  (recall  that 

y0  =  °). 


-bt 


a  (1-e 


)  =  y 


n 


(4.5) 


and 
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-bt .  -bt .  1 

"btn  n  (y.-y^j  jft.e  1  -  t.^e  ) 

at  e  =  E - 

n  i=1  ~bti  -bti-i 

e  —  e 


(4.6) 


As  can  be  easily  seen,  all  the  quantities  in  Equations  (4.5) 
and  (4.6)  are  known  except  a  and  b  .  Since  these  equations  do  not 
yield  simple  analytical  forms,  we  must  resort  to  numerical  methods 
for  their  solution.  The  resulting  values  of  a  and  b  are  the 
mle's  a  and  b,  respectively. 

It  should  be  pointed  out  that  even  though  the  mle's  are  the  desired 
values,  it  is  often  useful  to  study  the  log-likelihood  surface  as  a 
function  of  parameters  a  and  b  .  For  given  data,  a  plot  of  the 
log- likelihood  surface  can  be  obtained  by  solving  Equation  (4.3)  for  a 
grid  of  values  of  a  and  b .  If  the  plot  is  flat,  it  would  indicate 
a  large  variability  associated  with  the  mle's  while  a  sharp  sur¬ 
face  is  an  indicator  of  low  variability.  A  surface  with  sharp  rises 
and  falls  might  cause  problems  in  numerical  solution  of  Equations 
(4.5)  and  (4.6),  while  a  well-behaved  surface  would  ensure  rapid 
convergence  to  the  values  a,  b  . 

4.1.1  Variance-covariance  and  confidence  region  for  (a,b) 

In  addition  to  the  mle's  a,  6  ,  we  generally  want  to  quantify 
the  region  in  which  the  true  values  a,  b  might  lie  with  a  specified 
degree  of  confidence.  This  is  referred  to  as  obtaining  the  100(l-o)% 
joint  confidence  region  for  (a,b).  In  general,  it  is  not  possible 
to  get  the  exact  confidence  region  (see  Reference  20)  because 

A  a 

the  true  distribution  of  (a,b)  is  unknown.  However,  mle's 
have  a  very  desirable  property 


».Je  /wi»  *  ■ 


that  they  are  asymptotically  normally  distributed, if  the 

data  size  is  large.  In  practice  a  sample  size  of  approximately 

20  (n~20)  should  be  satisfactory  to  use  this  property. 

Also  of  great  usefulness  is  the  invariance  property  of  the 
mle's,  i. e. ,  a  function  of  (a,b)  can  be  estimated  by  using  the 
mle's  a,  b  and  this  function  will  also  be  a  mle.  This  will  be 

useful  for  estimating  N(t)  ,  R(t)  ,  etc. 

Formally,  as  indicated  above,  the  mle's  are  normally  dis¬ 
tributed  for  large  n,  i.ef. 


That  is  , 


(4.9) 
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r  ,  =  r. 
ab  ba 


p  d2lnL 
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(4.10) 


r.  ,  =  -  E 

bb 


d^lnL 

ab2 


(4.11) 


Taking  the  appropriate  partial  derivatives  of  Equation 

(4.3)  and  substituting  in  Equations  (4.9),  (4.10)  and  (4.11),  we 

obtain  after  some  simplification,  (recall  that  E(N(t.)J  =m(t.)  = 
-bt .  11 

a (1-e  x)  : 
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aa 
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i=l 


-bt 
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i-1 


-bt 


ab 


_  r_ 
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(4.12) 


(4.13) 


and 


bb 


n 

=  a[  £ 
i=l 


-bt .  .  _  -bt .  0  -bt 

"(ti+Ve  fcne  n] 


(4.14) 


Substituting  these  expressions  in  Equation  (4.8),  we  get  the 
variance-covariance  matrix  for  (a,b).  Thus  the  asymptotic  distri- 
bution  of  (a,b)  is  completely  specified  if  (a,b)  are  known.  However, 
(a,b)  are  of  course  not  known.  Therefore,  we  use  the  estimates 
(a,b)  for  (a,b)  in  Equations  (4.7),  (4.12),  (4.13)  and  (4.14)  to 

get  estimates  of  the  parameters  of  the  asymptotic  bivariate  normal 
distribution. 
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Now,  the  correlation  coefficient  between  a  and  b  is  esti¬ 


mated  as 


Cov(a,b) 

Jvar  (a)  ,  Var  (b) 


(4.15) 


where  Var(a),  Var(b),  Cov(a,b)  are  obtained  from  Equations  (4.8) 
to  (4.11). 

Finally,  to  obtain  the  100(l-a)%  confidence  regions  for  a 
and  b  we  use  the  following  approximation  (Reference  55) 


or 


*  -v  12 

XnL(a,b ly, t)  -  £nL(a,b  ly,  t)  =  -5-  x9. 

fc  ^  /  Of 

^  12 

£nL(a,b  ly,  t)  =  £nL(a,b ly , t)  -j 

^  ^  /  Ol 


(4.16) 


where  Xnl,(a,b  |y,  t.)  represents  the  value  of  the  log-likelihood 

A.  * 

function  at  a  =  a  and  b  =  b . 

Substituting  Equation  (4.3)  in  Equation  (4.16)  we  get 

n  n  -bt .  -bt., 

£  (y,  -  y.  )  Xna  +  Z  (y.-y.  ,)xn(e  1  -  e  1  ) 

i=l  1  i=l  1  1-1 

n  -bt 

-  E  in  (y.-y.  ,)i-a(l-e  n)  =  C  ,  (4.17) 

i=l  1 

where 

C  =  jenL(a,b  ly,  t)  *2ja  •  (4.18) 

Equation  (4.17)  defines  a  contour  of  the  100(1-0-)%  confidence 

A  A 

region.  For  given  data,  a,  b,  and  o.  Equation  (4.17)  can  be 
solved  for  those  values  of  a  and  b  which  satisfy  it.  (For  com¬ 
putational  purposes,  it  is  easier  to  take  values  of  a  (>a)  and 
solve  for  the  corresponding  values  of  b  . ) 


4.2  Estimation  When  Times  Between  Failures  Are  Given 

Now  we  consider  the  case  when  data  is  available  in  the  form 
of  times  between  individual  failures.  As  mentioned  earlier,  such 


data  is  not  common  and  is  rarely  available. 

Recall  that  X-,  ,X0,  .  . .  ,X^  denote  the  times  between  failures  and 
n 

S  =  E  X.  .  Then  the  data  is  in  the  form  x=  (x,  ,x0# . . .  ,x  )  and 
n  1  —  1  2  n 

s  =  e  x.  .  The  distribution  of  times  between  failures  was  discussed 
n  i-1  1 

in  Section  3.5  and  is  obtained  from  Eauations  (3.23)  and  (3.24),  as 

n 

-b  E  s  .  —bs 

,  *  ,  -u \ n  i=l  -a  ( 1-e  n)  . 

<p  (s. , .. .  ,s  )  =  (ab)  e  •  e 

O  -»#•••#  £>  -L 

1  n 

The  likelihood  function  for  a,  b  ,  given  £ ,  is  the  same  as  above 


and  can  be  written  as 

n 

-b  £  s.  — bs 

_ ,  ,  ,  ,  .  ,  . n  i=l  -a ( l~e  ) 

L(a,bls^)  =  (ab)  -e  •  e 


(4.19) 


Then  the  log  (natural)  likelihood  is 

n  -bs 

XnL(a,b  l£)  =  n£na  +  n,2nb  -  b  E  s.-a(l-e  ).  (4.20) 

i=l  1 

To  get  the  maximum  likelihood  estimates  a,  £ ,  we  take  the  partial 
derivatives  of  Equation  (4.20)  and  equate  them  to  zero,  i.e.. 


SlnL  _  0 
&a 

and 

Slnh  R 
Sb 


(4.21) 


(4.22) 


These  equations  yield 
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n 

a 


1  - 


-bs 

e 


n 


and 


n 

b 


-bs  n 

as  •  e  n-b  £  '  s  .  . 
n  i-1  1 


(4.23) 


(4.24) 


As  in  the  first  case,  these  equations  do  not  yield  simple 
analytical  solutions  and  have  to  be  solved  numerically.  The  solutions 
of  Equations  (4.23)  and  (4.24)  are  the  mle's  a  and  b . 

Regarding  the  asymptotic  distribution  of  (a,b) ,  recall  that 
(see  Section  3.4),  the  joint  density  of  is  improper. 

Therefore,  the  asymptotic  properties  of  mle's  do  not  hold  in 
this  case. 

To  obtain  the  100(l-cv)%  confidence  regions  for  (a,b)  we 
use  the  same  approximation  as  was  used  in  Section  4.1,  viz. 


XnL(a,b |sj  -  XnL(a,b IsJ  =  |  x^.  .  (4.25) 

From  Equations  (4.20)  and  (4.25),  a  contour  of  the  100(1-Q')% 
confidence  region  is  obtained  as 


n  -bs 

nXna+nXnb-b  E  s.-a(l-e  n)  =  c  ,  (4.26) 

i=l  1 

where 


C 


A  A 

XnL(a,b |sj 


1 

2 


(4.27) 


As  before,  Equation  (4.26)  can  be  solved  for  given  £ 

A  A 

a,  b,  and  a  to  get  the  desired  contours. 
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5.  G00DNE5S-0F-FIT  TEST 

A  non-homogeneous  Poisson  process  model  was  proposed  in 

Section  2  to  describe  the  software  failure  phenomenon.  The  mean 

value  function  of  this  model  was  given  in  Equation  (2.10).  In 

this  section  we  describe  the  Kolmogorov-Smirnov  goodness-of-f it 

test  (K-S  Test)  to  check  whether  this  model  provides  a  good  fit 

to  a  given  set  of  failure  data. 

Consider  the  case  when  the  data  are  given  as  a  sequence  of 

software  failure  times  s=(s,,s»,...,s  ).  We  want  to  test 

—  '12  n 

whether  the  events  £  are  generated  from  a  NHPP.  Suppose  that 

0<S.  <S-<...  <S  are  the  random  times  at  which  the  first  n 
12  n 

events  occur  in  a  NHPP  with  unknown  mean  value  function  m(t)  . 

We  wish  to  test  the  simple  hypothesis 


H0 

:  m(t) 

— 

3 

O 

rt 

for 

t  >  0  , 

versus 

H1 

:  m(t) 

¥ 

3 

O 

*r+ 

for 

t  >  0  . 

.  .  0 

Writing  mQ(t)  =aQ(l-e  )  ,  the  hypothesis  Hq  can  be  written  as 

-v 

Hq  :  m(t)  =  a0(l-e  u  )  for  t  >0.  (5.1) 

For  testing  purposes  we  need  the  joint  conditional  distribution 
of  the  failure  times.  The  following  theorem  is  useful  in  deriving 
this  distribution. 

Theorem  5.1.  Given  that  N(t)  •- n  ,  the  n  failure  times 

0 < S n  ^ S_ < . . .  < S  in  the  interval  [0,t]  are  random  variables 
12  n 
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whose  joint  conditional  distribution  is  the  same  as  the  distribu¬ 
tion  of  the  order  statistics  of  a  random  sample  of  size  n  from 

X-  V*  cl  1  i-  1  Vm  i  I  . — .  I  '  T  \ ^  ^  ^  ^  I — L  f\  c-''  V  .  4- 


the  distribution  G(x)  = 


m(t) 


for  0  <  x  <  t  . 


For  proof  of  Theorem  5.1  see  Cox  and  Lewis  (Reference  11), 

Corollary  5.1.  Given  that  S  =t,  the  (n-1)  failure  times 

0<S.<So^...<S  .  have  the  same  joint  conditional  distribution 

as  the  order  statistics  of  a  random  sample  of  size  (n-1)  from  the 

distribution  G  (x)  =  . 

m(t) 

This  Corollary  easily  follows  from  Theorem  5.1. 

Using  Corollary  5.1,  we  reduce  the  hypothesis  of  Equation 


(5.1)  to 


m0(x) 


H0  :  G(x)  =  gq(x)  =  m^TtT  f°r  t 


(5.2) 


For  our  case  we  have 


HQ  :  G(x)  = 


,-V 

■^v 


for  0  <  x  <  t  . 


(5.3) 


Note  that  the  expression  in  Equation  (5.3)  represents  a  truncated 
exponential  distribution. 

We  now  consider  the  Kolmogorov-Smirnov  (K-S )  goodness-of-fit 
test  (References  53,  55).  Given  the  values  of  a  random 
sample  of  size  n-1  ,  si» s2 ' ' *  * ' sn-i'  we  define  the  sample  cdf  by 
Hn  ^(x)  =  k/(n-l),  where  k  is  the  number  of  sample  values  <x. 

Thus  Hn_^(x)  is  a  step  function  which  is  zero  for  x  less  than 
,  has  a  jump  of  l/(n-l)  at  each  s^  ,  and  is  1  for  x  greater 
than  or  equal  to  sn_^  .  That  is  , 
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0  ,  x  < 

k/(n-l)  ,  sk_1£x<sk,  k  =  2,3 . n-1.  (5.4) 

1  •  xisn-l 

Since  H  ^  is  a  step  function  and  G  is  monotonically  increasing 
and  continuous,  it  suffices  to  test  the  absolute  deviations  at 
the  sample  points  s^  ,  k  =  1, 2 , .  .  .  , n-1  ,  and  then  take  the  maximum 
of  these  (n-1)  values.  The  following  procedure  is  used  for 
calculating  the  test  statistic  D.  For  each  k  =  1, 2 , . . . , n-1  ,  set 

Dk  »  maxjiGolSk’-STI1  •  IGo  <sk>  '  j£l 1 1  • 

Then  set 

D  =  max  { Dv }  .  (5.5) 

k  * 

If  the  value  of  D  calculated  in  Equation  (5.5)  is  greater  than 

or  equal  to  the  critical  value  D  n  ,  we  reject  the  null 

hypothesis  Hq  that  S 1»S2 ,  . . • » sn_^  follow  Gq(x)  ;  otherwise  we 

do  not  reject  the  null  hypothesis.  The  critical  values  D  , 

n- 1 ;  o- 

associated  with  the  K-S  test  at  a  level  of  significance  a  are 
available  from  statistical  tables  (see  Reference  53,  p.  661). 

It  should  be  noted  that  if  the  parameters  of  GQ(x)  are 
estimated  from  the  sample,  the  K-S  test  can  be  used  but  will  give 
extremely  conservative  results.  To  achieve  better  results,  the 
level  of  significance  needs  to  be  adjusted.  One  approach  suggested 


by  Alo.en  (Reference  2)  is  to  test  at  the  5%  level  of  significance  and 
use  the  critical  value  for  the  20%  level  or  test  at  the  1%  level 
and  use  the  critical  value  for  10%  level.  We  will  use  this  approach 
in  our  analyses  in  Sections  7  and  8. 
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Another  use  of  the  K-S  test  in  our  context  is  in  developing 
confidence  limits  for  the  true  cdf  G(x)  .  For  example,  if  we  take 


a  random  sample  of  size  (n-1)  and  use  it  to  construct  the  sample 

cdf  Hn_^(x)  ,  then  we  can  be  I00(l-a)%  confident  that  the  true 

cdf  G(x)  does  not  deviate  from  H  .  (x)  by  more  than  D  ,  . 

n- x  n- x  *  o f 

That  is,  the  100(l-a)%  confidence  limits  for  G(x)  are  given  by 


Hn-l<x)-Dn-l.-«  <  G(x)  <  Hn-l(x,  +  Dn-l;=, 


(5.6) 


These  limits  are  especially  useful  in  the  case  when  the  parameters 
of  Gq(x)  are  to  be  estimated  from  the  data.  For  this  case  the 
null  hypothesis  will  be  rejected  at  a  level  of  significance  a 

if  one  or  more  points  of  Gq(x)  fall  outside  the  100(l-a)%  con¬ 
fidence  limits  given  by  Equation  (5.6).  Otherwise,  it  will  not 
be  rejected. 


6.  GENERAL  METHODOLOGY  FOR  SOFTWARE  FAILURE  DATA  ANALYSIS 


Sections  2  through  5  were  devoted  to  the  development  of 
models,  measures,  estimation  techniques  and  a  goodness-of-f it  test.  i 
In  this  section  we  summarize  the  procedure  for  analyzing  actual 
software  failure  data.  Analyses  of  failure  data  from  two  typical 
systems  are  presented  in  Sections  7,  8  and  9. 

The  step  by  step  procedure  is  shown  in  Figure  6.1  and  des¬ 
cribed  below.  I 

] 

Step  1;  Study  the  failure  data.  : 

The  model  described  in  this  report  assumes  that  the  failure  ! 
data  represents  the  data  collected  after  the  system  has  been 
integrated  and  the  number  of  failures  per  unit  time  is  statistically 
decreasing.  If,  however,  this  is  not  the  case,  the  NHPP  model  of 
Section  2  may  not  yield  satisfactory  results.  Furthermore,  adequate 

amount  of  data  must  be  available  to  get  a  satisfactory  model. 

A  rule  of  thumb  would  be  to  have  at  least  ten  data  points. 

Step  2;  Obtain  estimates  a  and  b  of  parameters  a  and  b  , 
respectively. 

Two  methods  are  available  depending  upon  the  type  of  avail¬ 
able  data. 

If  the  data  is  in  the  form  of  pairs  (t,y) ,  the  maximum 
likelihood  estimates  are  obtained  by  simultaneously  solving 
Equations  (4.5)  and  (4.6).  [Section  4.1] 

If  the  data  is  in  the  form  of  times  between  failures,  the 
maximum  likelihood  estimates  a^e  obtained  by  simultaneously 
solving  Equations  (4.23)  and  (4.24).  [Section  4.2] 
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Step  3;  Obtain  the  fitted  model. 

The  fitted  model  is  obtained  by  first  substituting  a  and  b 
from  Step  2  for  a  and  b ,  respectively,  in  Equation  (2.10)  to  get 
m(t),  and  then  substituting  m(t)  for  m(t)  in  Equation  (2.13). 

At  this  stage,  we  have  a  fitted  model  based  on  the  available 
failure  data. 

Step  4:  Perform  goodness-of-f it  test. 

Before  proceeding  further,  it  is  advisable  to  conduct  the 

Kolmogorov-Smirnov  goodness-of-f it  test  to  check  the  model  fit. 

This  is  done  by  following  the  procedure  of  Section  5.  Specifically, 

the  observed  value  of  D  is  obtained  from  Equation  (5.5)  and 

compared  with  the  critical  value  D  for  a  desired  significance 

n  ;  o 

level  a.  In  general  <*=.05  or  .  10  is  quite  satisfactory. 

If  the  model  fits,  we  can  move  ahead.  However,  if  the  model 
does  not  fit,  we  have  to  collect  additional  data  or  seek  a  better, 
more  appropriate  model.  There  is  no  easy  answer  to  either  how 
much  more  data  to  collect  or  how  to  look  for  a  better  model. 

Decisions  on  these  issues  are  very  much  problem  dependent. 

Step  5;  Compute  confidence  regions. 

It  is  generally  desirable  to  obtain  80%,  90%,  95%  and  99%  join t 
confidence  regions  for  the  parameters  a  and  b  using  the  method  des¬ 
cribed  in  Section  4.  Such  regions  are  given  by  Equation  (4.17)  for 
cumulative  failures  data  and  by  Equation  (4.26)  for  the  times  between 
failures  data. 

Step  6;  Obtain  performance  measures. 

At  this  stage  we  can  compute  various  quantitative  measures 
to  assess  the  performance  of  the  software  system.  Several  useful 
measures  and  expressions  were  given  in  Section  3.  Equations  (3.1), 


mi  ii  inianiBM*  'ritz*mmmim 


W 


(3.3),  (3.5),  (3.6),  (3.8)  and  (3.19)  can  be  used  for  this  purpose. 
The  specific  measures  to  be  employed  will  vary  from  one  application 
to  another.  Confidence  bounds  can  also  be  obtained  for 
these  measures  to  eval\iate  the  degree  of  uncertainty  in  the  computed 
values . 
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7.  ANALYSIS  OF  FAILURE  DATA  FROM  A  LARGE 
SCALE  SOFTWARE  SYSTEM 

The  data  to  be  analyzed  in  this  section  have  been  taken  from 
a  large  scale  project  reported  in  Thayer  et  al.  (1976).  This 
project  represents  an  initial  delivery  of  a  large  command  and 
control  software  package  written  in  JOVIAL/ J4 ,  (JOVIAL  is  a  nigher 
order  language  generally  used  for  Air  Force  Command  and  Control 
applications),  it  consists  of  115,346  total  source  statements  and 
249  routines.  Some  other  characteristics  of  this  project  are 
summarized  in  Table  7.1. 

7.1  Failure  Data 

The  failure  data  used  for  this  study  is  taken  from  the 
Software  Problem  Reports  (SPR's)  generated  during  the  formal  test¬ 
ing  phases  of  this  project.  The  majority  of  software  errors  were 
detected  during  Validation  (Jun  1-Aug  12),  Acceptance  (Aug  13-Aug  24), 
Integration  (Aug  25-Oct  26),  and  Operational  Demonstration  (Oct  27- 
Nov  12)  testing.  However,  operational  data  spanning  a  period 
of  approximately  nine  months  was  also  available  and  is  used  for 
comparison  with  the  predicted  values.  The  only  time  frame  readily 
available  from  the  data  was  the  calendar  day.  The  data  also  con¬ 
tain  the  mistakes  by  the  operators  and  the  "explanatory"  errors, 
i.e.,  corrections  to  make  a  change  to  a  comment  statement  or  those 
errors  for  which  a  "fix"  is  not  to  a  routine.  These  explanatory 
errors  do  or  do  not  indicate  the  type  of  change.  Therefore,  the 
original  data  was  restructured  into  four  sets  of  data  denoted  by 


TABLE  7.1 

SOFTWARE  PROJECT  CHARACTERISTICS 


Size  (Total  source  statement) 

115, 346 

Number  of  routines 

249 

Language 

JOVIAL/ J4 

Formal  Requirements 

To  function  level 

Co-con tractor 

Yes 

Subcontractor 

No 

Operating  Mode 

Batch 

Formal  Testing 

24  Weeks 

Validation 

10 

Acceptance 

2 

integration 

10 

Operational  Demonstration 

2 

I 


DSl,  DS2,  DS3  and  DS4  (Reference  63) .  The  description  and  the 
total  number  of  errors  detected  during  the  formal  testing  phases 
for  each  data  set  are  given  in  Table  7.2. 

In  this  cinalysis  the  number  of  software  errors  detected 
during  formal  testing  is  counted  on  a  weekly  basis.  Also,  for 
each  data  set  the  software  errors  detected  during  the  first  nine 
weeks  are  eliminated  due  to  the  fact  that  we  are  interested  in 
analyzing  the  software  failures  over  the  period  when  they  are 
decreasing.  The  number  of  SPR's  for  the  15-week  period  for  the 
four  cases  (DSl  to  DS4)  are  given  in  Table  7.3. 
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EX2  represents  the  explanatory  errors  which  indicate  type  of  change. 


7.2  Estimation  of  Parameters 

The  data  for  this  project  are  in  the  form  ( t]_*  Y^)  <  (t  ,  y2 ) » 

...,(t  .y.,)  i.e.,as  the  number  of  failures  in  specified  time 

15  1-3  ' 

a 

intervals.  Hence  the  estimates  a  and  b  are  obtained  by  simul¬ 
taneously  solving  Equations  (4.5)  and  (4.6).  Thus,  by  substituting 
the  data  set  DSl  in  Equations  (4.5)  and  (4.6)  and  solving,  we  get 

a  =  1348  ,  b  =  0.124  , 

and  the  fitted  mean  value  function  is 

m(t)  =  1348  (1  - e“°*124t)  ,  t>0. 

This  is  also  an  estimate  of  the  expected  number  of  software  failures 
detected  by  time  t  .  A  plot  of  the  actual  cumulative  number  of 
failures  and  the  fitted  values  is  given  in  Figure  7.1. 


OF  FAILURES 


CONFIDENCE  BOUNDS 


Figure  7.1 


Actual  and  Expected  Cumulative  Number 
of  Failures  and  90%  confidence  bounds 
for  the  N(t)  process  for  data  set  DSl 


7.3  Goodness-of-f it  Test 

The  goodness-of-fit  test  is  now  conducted  following  the  pro¬ 
cedure  discussed  in  Section  5.  Since  the  sample  size  is  15,  the 
null  hypothesis  to  be  tested  can  be  written  as 


HO'W  = 


-e'V1 

’!V15> 


for  i =  1, 2 , . . . , 15  , 


(7.1) 


and  the  sample  cdf  as 


0  ,  x  <  t. 


H (x )  —  |  y j y ^  i  ^"2.—  ^  <  x  <  t  i  —  2, 3, • •  •  ,  15. 


(7.2) 


\  1  /  x>t15 

The  computed  values  of  H(x)  for  various  t^  are  given  in  column  2 
of  Table  7.4. 


Now  we  substitute  bQ  =  b  =  0.124  in  Equation  (7.1)  and  compute 

the  value  of  GQ(ti)  for  i  =  1,  2 ,  .  .  . ,  15  .  These  values  are  given 

in  column  3  of  Table  7.4.  Columns  4  and  5  of  this  table  are  the 

quantities  needed  to  find  D  =  max(D.}  (see  Equation  (5.5)).  From 

k  K 

these  columns  we  find  the  value  of  D  to  be  0.096  corresponding 
to  t .  =  9  . 

To  find  the  critical  value  corresponding  to  sample  size  15 
and  or  =  .05  ,  we  first  note  that  the  parameters  had  to  be  estimated 
in  this  case.  As  mentioned  in  Section  5,  for  a  situation  like  this 
a  suggested  approach  is  to  take  a =.20  to  get  good  results.  From 
the  statistical  tables  (Reference  53,  p.  661),  2^  =  0.266  . 

The  observed  value  D=  0.096  is  less  than  the  critical  value  0.266 
and  hence  we  accept  the  null  hypotheses  of  Equation  (7.1).  Thus 
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TABLE  7.4 


DATA  FOR  KOLMOGOROV— SMIRNOV  TEST 
(DATA  SET  DSl) 


H(ti) 

W  ] 

IG0 (ti)-H (ti) 1 

IG0(t.)-H(t 

0. 1784 

0.1381 

0. 0403 

0. 1381 

0.2979 

0.2601 

0.0378 

0.0817 

0.4587 

0. 3679 

0.0908 

0.0700 

0. 5000 

0.4631 

0.0369 

0.0044 

0.5404 

0.5472 

0.0068 

0.0472 

0.6028 

0.6215 

0.0187 

0.0811 

0.6503 

0.6872 

0.0369 

0.0844 

0.7004 

0.7452 

0.0448 

0.0949 

0.7707 

0.7964 

0.0257 

0.096 

0.8269 

0.8416 

0.0147 

0.0709 

0.8506 

0.8816 

0.031 

0.0547 

0.8875 

0.9169 

0.0294 

0.0663 

0.9359 

0.9481 

0.0122 

0.0606 

0.9903 

0.9757 

0.0146 

0.0398 

1 .0000 

1  .0000 

0 .0000 

0.0097 

9 


we  conclude  that  at  5%  level  of  significance  the  model 


P{N(t)=y] 


fl348a-e-°-:L24t) 


■ii/iq/1  -0.124t. 
j-  e~1348  ( 1-e 


can  be  considered  to  provide  an  adequate  fit  to  data  set  DSl. 

To  further  check  the  adequacy  of  fit,  we  compute  95%  con¬ 
fidence  bounds  on  G(t^)  .  Prom  Equation  (5.6),  these  bounds  are 
given  by 


H(ti>-D15;.05 


G  ( t  i ) 


H(ti) + 


D 


15; .05 


From  the  statistical  tables,  D,r  =  0.366  and  hence  the  95% 
confidence  bounds  are  given  by  H(t^)  +  0.366  .  A  plot  of  these 
bounds  and  the  fitted  values  are  shown  in  Figure  7.2. 
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0.6- 


6  9 

TIME  (  WEEKS) 


Figure  7.2  95%  confidence  bounds  for  the  conditional 

c.d.f.  G(t.)  and  the  fitted  curve  for 
DS1  data 


7.4  Con!  i donee  Region;;  for  (a,!;) 


To  get  an  appreciation  of  the  variability  in  the  estimated 
values  of  a  and  b,  we  now  construct  confidence  regions  for 
(a,b).  Such  regions  are  given  by  Equations  (4.17)  and  (4.18). 

For  a  =  .05  ,  the  95%  joint  confidence  region  will  be  the  solution 
of  the  following  equation; 

15  15  124t .  124t._. 

Z  (y.-y.  .)Xnl348  +  Z  (y.-y.  ,)Xn(e  i  -  e  1  ) 

i=l  1  1“x  i=l  1  1~± 


15  - . 124t. _ 

-  Z  Xn  (y.-y.  .):-a(l-e  ^)  =C, 

i-1  1 


where 


15  15  -. 124t .  -. 124t .  . 

XnL(a,b!y,t)  -  I  (y . -y  )f  n  ( 1348)  +  Z  (y^-y.,)  •  Xn  (e  1  -  e  1  ) 

i=l  1  ~  i=l 


3.5  -.124t. 

Z  Xn(y.~y .  ).'  -  1348(l-e  ), 

i=l  1  1 


and 


A  -  12 

C  -  XnL(a,bly,t)  -  j  ^2;. 05  * 


Data  (y1,t1),(y2,t2),. 


^15^15 


2 

x2; .05 


)  were  given  in  Table 
0.103. 


7.3 


and 


A  plot  of  this  region  is  shown  in  Figure  7.3.  From  this  plot  we 
sec  that  oven  though  the  most  likely  values  of  a  and  b ,  based 
on  the  data,  arc  a  =  1348  ,  b  =  0.124  ,  the  true  values  can  vary  over 
the  entire  region  contained  in  the  95%  contour.  Values  a  =  1450  , 
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k  =  0.11  will  be  acceptable  (with  95%  confidence)  and  so  will 
a  =  1250,  b  =  0.14.  50%  and  75%  confidence  regions  are  also  shown 

in  Figure  7.3  and  can  be  similarly  interpreted. 
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7.5  Variance-Covarinnce  Matrix 

The  variance-covariance  matrix  is  useful  in  quantifying  the 
variability  in  the  estimated  parameters  and  is  obtained  from 
Equations  (4.8),  (4.12),  (4.13)  and  (4.14)  by  substituting 

a  =  a  =  1348  ,  b  = b  =  0. 124,  and  the  actual  data  values  from  Table  7.3. 
For  data  set  DSl,  we  get 


(  2368 
-0.2071 

From  this  we  have 

Standard  Deviation  (a)  =  ,/var  (a)  =  48.66 
Standard  Deviation  (b)  ==  Jvax  (b)  =  0.00745 
Correlation  Coefficient  (a,b)  s  * 


-0.2071 


5.554  x  10 


-5 


-0.2071 


2368) (5.554  X  10 


) 


0.571  . 
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7.6  Runiber  of  Remaining  Errors 


One  useful  quantity  is  the  estimated  number  of  remaining 
errors  in  the  system  after  some  time  t .  This  value  is  obtained 
from  Equation  (3.5)  as 


E{H(t) }  =  a*e 


-bt 


or  E(N(t)J  =  1348e 


-0. 124t 


A  plot  of  this  quantity  is  shown  in  Figure  7.4.  As  expected, 
this  value  decreases  with  time.  Also  shown  is  a  plot  of  the  "actual" 
number  of  remaining  errors  which  is  based  on  the  assumption  that 
all  the  errors  were  found  during  36  weeks  of  operation.  it  should 
be  noted  that  this  assumption  is  made  for  illustration  purposes 
only  and,  in  general,  this  may  not  be  the  case. 

It  would  also  be  interesting  to  compute  confidence  bounds  on 
EN(t).  Such  bounds  can  be  easily  computed  as  follows. 

Let  f(a,b)  denote  EN(t)  .  Then,  it  is  well  known  (References 
53,  55)  that  100(l-a)%  confidence  bounds  for  f(a,b)  are  given  by 


(f  (a,b)  ±  tn_2  .  a/2  /v(  f  (a,b))] 


where 


V(f (a,b 


E 

cov 


a=a  ,  b~b 


(7.3) 


(7.4) 


and  t  ~  /-I  is  the  upper  a/2  percentage  point  of  the  t-distr ibution 

n-z ;  a  /  £ 

with  (n-2)  degrees  of  freedom. 

The  90%  confidence  limits  for  E[N(t)}  for  data  set  DSl  arc 
computed  from  the  above  equations  and  are  plotted  in  Figure  7.4. 


EXPECTED  NUMBER  OF  REMAINING  S/W  ERRORS 


7.7  Software  Reliability 

Software  reliability  is  a  commonly  used  performance  measure 
to  assess  how  reliable  the  system  is  at  various  times.  To  compute 
software  reliability,  we  use  Equation  (3.19)  and  get 


-afe“bs  -  e"b(s+x) ) 

Vsk-i(XlS)=e 


This  gives  the  reliability  after  time  x  starting  from  the  current 
time  s  .  For  example,  starting  from  s  =  15  ,  the  reliability  after 
0.04  weeks,  i.e.,  at  s+x  =  15.04,  is 


R (0. 04  I s=15 ) 


-1348 


(e-(. 124)15 


e-(.124)  (15.04) 


or  R (15 . 04 )  =  0.354. 


To  see  how  reliability  varies  with  time,  a  plot  of  R(xls=15)  is 
shown  in  Figure  7.5. 

To  obtain  confidence  bounds  on  reliability,  we  use  a  pro¬ 
cedure  similar  to  the  one  used  for  getting  bounds  on  E{N(t)]. 

Let  g(a,b)  represent  R(xls=15)  .  Then  the  confidence  bounds 
are  given  by 


(g(a,b)  +  tn_2.a/2*/v(g(a,b)  )] 


where 


V(g  (a/b) )  =  (f§  §§)  £coy 


a=a 

A 

b=b 


(7.5) 


(7.6) 


90^  confidence  bounds  computed  from  these  equations  for  the  given 
data  are  shown  in  Figure  7.5. 


7.8  Summary  of  Analyses  for  DSl  to  DS4 

Analyses  similar  to  those  for  data  set  DSl  were  undertaken 
for  data  sets  DS2#  DS3  and  DS4  of  Table  7.3.  A  summary  of  the 
results  is  given  in  Table  7.5. 
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TABLE  7.5 


A  SUMMARY  OF  DATA  ANALYSES 


~~  Data  Set 

Quantity 

DS1 

DS2 

DS3 

DS4 

a 

1348 

1823 

3958 

3446 

b 

0.124 

0.112 

0.0768 

0.0771 

yVar(a) 

48.7 

62.2 

147.3 

136.6 

/  A  A 

V  Var(b) 

0.00745 

0.00643 

0.00460 

0.00492 

‘Pa,b 

-0.571 

-0.648 

-0.856 

-0.855 

Estimated  Number  of  Remain¬ 
ing  Errors  at  the  end  of 

Operational  Demonstration 

209 

338 

1212 

1050 

Number  of  Errors  Detected 

During  Nine  Months  of 

Operation 

198 

263 

540 

475 

i 


i 
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8.  ANALYSIS  OF  FAILURE  DATA  FROM  NAVAL 
TACTICAL  DATA  SYSTEM  (NTDS) 

Jelinski  and  Moranda  (Reference  31)  first  analyzed  some  software 
failure  data  from  the  U.S.  Navy  Fleet  Computer  Programming  Center. 
Since  then  this  data  set  has  been  used  by  several  investigators 
for  model  validation  purposes.  in  this  section  we  analyze  the 
same  data  set  to  see  how  good  the  NHPP  model  is  in  modelling  these 
failures.  In  the  next  section  we  will  compare  the  results  from 
the  NHPP  model  with  those  of  Jelinski  and  Moranda. 

The  data  set  was  extracted  from  information  about  errors  in 
the  development  of  software  for  the  real-time,  multi-computer  com¬ 
plex  which  forms  the  core  of  the  Naval  Tactical  Data  System  (NTDS). 

The  NTDS  software  consisted  of  some  38  different  project  schedules. 
Each  module  was  supposed  to  follow  three  stages:  the  production 
(or  development)  phase,  the  test  phase,  and  the  user  phase.  Many 
of  the  "trouble  reports"  or  "software  anomaly  reports"  were  gen¬ 
erated  whenever  a  system-level  symptom  of  a  deficiency  was  noted 
by  operators  or  users.  A  proper  trace  back  to  the  exact  cause  ir 
software  of  this  symptom  was  done  by  personnel  familiar  with  the 
entire  system.  However,  Jelinski  and  Moranda  felt  that  it 
was  better  to  analyze  the  data  from  isolated  modules  than  from  the 
total  system,  due  to  the  fact  that  many  of  the  modules  did  not 
evolve  in  the  fashion  indicated.  One  of  the  larger  modules,  denoted 
by  A-module,  had  the  desired  pattern.  The  times  (in  days)  between 
failures  for  this  module  are  shown  in  Table  8.1.  Twenty-six 
software  errors  were  found  during  the  production  phase  and  five 


TABLE  8.1 


SOFTWARE  FAILURE  DATA  FROM  NTDS 


ERROR  NO. 

TIME  BETWEEN  FAILURES 
x^»  days 

CUMULATIVE  TIME 

s  =£  x,  ,  days 
n  k 

Production 

(Checkout)  Phase 

1 

9 

9 

2 

12 

21 

3 

11 

32 

4 

4 

36 

5 

7 

43  j 

6 

2 

45 

7 

5 

50 

8 

8 

58 

9 

5 

63 

10 

7 

70 

11 

1 

71 

12 

6 

77 

13 

1 

78 

14 

9 

87 

15 

4 

91 

16 

1 

92 

17 

3 

95 

18 

3 

98 

19 

6 

104 

20 

1 

105 

21 

11 

116 

22 

33 

149 

23 

7 

156 

24 

91 

247 

25 

2 

249 

26 

1 

250 

Test  Phase 

27 

87 

337 

28 

47 

384 

29 

12 

396 

30 

9 

405 

31 

135 

540 

User  Phase 

32 

258 

798 

Test  Phase 

33 

16 

814 

34 

35 

849 

p 

r~ 


additional  errors  during  the  test  phase.  The  last  error  was  found 
on  4  Jan  1971.  One  error  was  observed  during  the  user  phase  on 
20  Sept  1971  and  two  more  errors  (5  Oct  1971,  10  Nov  1971)  during 
the  test  phase.  This  indicates  that  a  re-work  of  the  module  had 
taken  place  after  the  user  error  was  found.  A  more  detailed  de¬ 
scription  of  the  NTDS  software  can  be  found  in  Jelinski  and 
Moranda. 


Data  Analyses 

The  data  in  this  case  is  available  as  times  between  software 

failures  and  hence  the  method  described  in  Section  4.2  will  be  used 

for  estimation  of  parameters.  We  consider  the  first  26  data  points 

26 

in  Table  8.1,  for  which  n=26  and  s0fc  =  £  x,  =  250  days. 

b  k=l 

To  get  an  appreciation  of  the  likelihood  function  associated 
with  this  data  set,  the  log-likelihood  from  Equation  (4.20)  is 
plotted  in  Figure  8.1.  We  see  that  the  surface  rises  sharply 
along  the  b-axis  and  is  relatively  flat  along  the  a-axis. 

The  maximum  of  this  surface  is  obtained  by  solving  Equations 
(4.23)  and  (4.24).  Substituting  the  appropriate  values  from 
Table  8.1  in  Equations  (4.23)  and  (4.24)  we  get 


26  =  i _  e-b(250) 
a 


(8.1) 


and 

^  =  a(250) -e“b(250)  -  b(250)  .  (8.2) 

Solving  Equations  (8.1)  and  (8.2),  numerically,  we  get 
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Likelihood  Surface  Based  on  NTDS  Data 


a  =  33.99 


and 


b  =  0.00579 

as  the  mle's  for  a  and  b,  respectively.  The  fitted  mean  value 
function  is 

m(t)  =  33.99(1  -  e“°*00579t )  .  (8.3) 

and  is  shown  in  Figure  8.2,  along  with  the  actual  data  (determination 
of  the  confidence  bounds  will  be  discussed  later). 


Goodness-of-f it  test 

We  now  perform  the  Koltnogorov-Smirnov  goodness-of-f it  test 
to  check  the  adequacy  of  the  fitted  model.  Now,  using  Corollary  5.1 
and  the  results  in  Section  5,  we  conduct  the  test  based  on  26-1 =25 
points.  The  hypothesis,  from  Equation  (5.2),  is 

-b  x 
1—  e  ^ 

H0  •  )  —  (250)  for  0  <  x  <  2  50  ,  (8.4) 

1-e  0 

and  the  sample  cdf  is 

10  ,  x  < 

k/25  ,  sk_1<x<sk,  k  =  2, 3, .  .  .  ,25  .  (8.5) 

1  •  x  —  s25 

The  values  of  sk  and  H(sk)  are  given  in  Table  8.2.  To  compute 
G0(sk)  for  various  sk  values,  we  replace  bQ  by  b  in  Equation  (8.4) 
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NO.  OF  ERRORS 


Figure  8./-  Plots  of  Mean  Value  Function  and  90%  Confidence 
Hounds  for  the  N(t)  Process  (NTDS  Data) 
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TABLE  8.2 


KOLTOGOROV-SMIRNOV  TEST 
FOR  THE  NTDS  DATA  SET 


k 

** 

H(sk) 

W 

IGo(V-H<sfc)l 

IGo(V-B(8k-l>1 

1 

9 

0.04 

0.0664 

0.0264 

0.0664 

2 

21 

0.08 

0. 1497 

0.0697 

0. 1097 

3 

32 

0.12 

0.2211 

0.1011 

0.1411 

4 

36 

0. 16 

0.2460 

0.0860 

0.1260 

5 

43 

0.20 

0.2882 

0.0882 

0.1282 

6 

45 

0.24 

0.2999 

0.0599 

0.0999 

7 

50 

0.28 

0. 3286 

0.0486 

0.0886 

8 

58 

0.  32 

0.3730 

0.0530 

0.0930 

9 

63 

0.36 

0.3996 

0.0396 

0.0796 

10 

70 

0.40 

0.4357 

0.0357 

0.0757 

11 

71 

0.44 

0.4407 

0.0007 

0.0407 

12 

77 

0.48 

0.4703 

0.0097 

0.0303 

13 

78 

0.52 

0.4751 

0.0449 

0.0049 

14 

87 

0.56 

0.5174 

0.0426 

0.0026 

15 

91 

0.60 

0.5355 

0.0645 

0.0245 

16 

92 

0.64 

0.5399 

0. 1001 

0.0601 

17 

95 

0.68 

0.5532 

0. 1268 

0.0868 

18 

98 

0.72 

0.5661 

0. 1539 

0. 1139 

19 

104 

0.76 

0.5915 

0.1685 

0. 1285 

20 

105 

0.80 

0.5956 

0.2044 

0. 1644 

21 

116 

0.84 

0.6395 

0.2005 

0. 1605 

22 

149 

0.88 

0.7557 

0. 1243 

0.0843 

23 

156 

0.92 

0.7776 

0.1424 

0.1024 

24 

247 

0.96 

0.9946 

0.0346 

0.0746 

25 

249 

1.00 

0. 9982 

0.0018 

0.0382 
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and  obtain  Column  4  of  Table  8.2.  Entries  in  columns  5  and  6  are 
easily  obtained  from  Columns  3  and  4.  Now,  from  Equations  (5.5) 
and  (8.4) 


D  = 


max{ 

k 


lG0(sk}  -«(sk)  MG0(sk)  -H(sk_1)J 


In  other  words,  D  is  the  largest  entry  in  Columns  5  or  6  and  is 
seen  to  be 


D  =  0. 2044  . 

To  test  at  a =  .  05  ,  we  use  a  critical  value  corresponding  to  a=  .20 
as  discussed  in  Section  5. 

From  statistical  tables, 

D25;0.2  =  0,208 

Since  D<D25;o.2  ,  we  accept  the  null  hypothesis,  HQ  ,  at  5%  level 
of  significance. 

The  100 ( 1-a )%  confidence  limits  tor  G(x)  can  now  be  calculated 
from  Equation  (5.6). For  example,  for  o/  =  0.05  we  have  nt-  =  0.264, 

so  that  the  lower  and  upper  confidence  bounds  are 

L(x)  =  maxfH (x)  -  0. 264,0] 

and 


U(x)  =  min[H(x) +  0.264, lj  , 

where  II  (x)  is  given  by  Equation  (8.5).  The  95%  bounds  for  G(x)  , 
along  with  GQ(x)  ,  are  shown  in  Figure  8.3.  We  see  that  the  fitted 
model  seems  to  be  adequate. 
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C.D.F. 


Figure  3.3  95%  confidence  bounds  for  the  conditional 

c.d.f.  G(x)  and  the  fitted  C.JJ.F.  curve 
(NTDS  di.tn) 
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Having  established  that  the  model  provides  a  good  fit,  var¬ 
ious  performance  measures  of  interest  can  be  obtained  by  substitu¬ 
ting  the  estimated  values  of  a  and  b  in  the  appropriate  equations 
of  Section  3. 

The  estimated  mean  value  function,  as  given  in  Equation  8.3, 
m(t)  =  33.99(1-3  ^.00579t^  A  plot  of  m(t)  and  the  actual  number 
of  errors  detected  during  the  production  period  for  this  case  was 
given  in  Figure  8.2.  Also  shown  were  the  90%  confidence  bounds 
for  the  N(t)  process  as  computed  from  Equation  (3.1). 

The  100(1-0-)%  confidence  regions  for  a  and  b  are  obtained 
from  Equations  (4.26)  and  (4.27)  following  a  procedure  similar  to 
the  one  detailed  in  Section  7.  These  are  shown  in  Figure  8.4  for 
a  =  0.05,  0.25,  and  0.50. 

Next,  an  estimate  of  the  expected  number  of  errors  remaining 
in  the  software  system  at  t  =  250  days,  given  that  N(250)  =  26,  is 
obtained  from  Equation  (3.8)  as 

E{N(250)  IN(250)=26]  =  33.99  -  26  =  7.99  . 

As  indicated  in  Table  8.1,  eight  errors  were  found  during  usage 
of  the  system,  subsequent  to  the  production  phase.  The  excellent 
match  between  the  predicted  and  actual  values  is  coincidental  and 
in  general  the  NHPP  model  is  not  expected  to  perform  this  well. 

Finally,  software  reliability ,  R  .  (xl250)  ,  can  be  com- 

X27IS26 

puted  from  Equation  (3.19).  For  example,  the  reliability  values 
after  x=5,  10,  20  and  30  days  are  0.796,  0.638,  0.417  and  0.280, 
respectively.  Thus  the  probability  that  the  system  will  operate 


PARAMETER 


Figure  0.4  Joint  Confidence 


without  any  failures  for  30  additional  days  in  0.26.  As  seen 
from  the  data  in  Table  8.1,  the  system  did  operate  without  any 
failures  for  87  days  subsequent  to  failure  number  26. 


9.  A  COMPARISON  OF  NTDS  DATA  ANALYSES  USING 
THE  NI1PP  AND  THE  DE- EUTROPHICATION  MODELS 

As  mentioned  in  Section  8,  the  NTDS  data  has  been  analyzed 
previously  by  several  investigators  for  model  validation  purposes. 
The  first  such  analysis  was  undertaken  by  Jelinski  and  Moranda 
(Reference  31)  using  a  De- Eutrophication  model.  In  this  section 
we  provide  a  limited  comparison  of  the  results  of  analyses  using 
the  NHPP  and  the  De-Eutrophication  models. 

For  the  De- Eutrophication  process,  the  cdf  of  ,  the  time 
between  the  (k-l)st  and  the  kth  failures, is  given  by 


‘k(x)  =  I'N-k+i<x) 


=  1  -  e 


-  (N-k+1)  cpx 


k  —  _l,2,.«.. 


(9.1) 


where  N  is  the  number  of  errors  in  the  system  at  time  zero  and  cp 
is  the  error  occurrence  rate  per  error. 

The  likelihood  function  for  N  and  cp  for  given  data 

x1,x2 , . . . ,xn  is 

L(N,cplx)  -  fH_k+l(V 

n  -  (N-k+1)  cpx, 

=  n  (N-k+1  )cpe  K 

k=l 

and  the  log -likelihood  is 


n  n 

^nL(N,cp!x)  =  nincp  +  E  Xn(N-k+l)  -  E  (N-k+l)cpx^  .  (9.2) 

k=l  k=l  K 
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The  maximum  likelihood  estimates  of  N  and  cp  are  the  values 
N  and  cp,  respectively,  that  maximize  Liquation  (9.2).  Taking  the 
partial  derivatives  of  Equation  (9.2)  and  equating  them  to  zero, 
the  likelihood  equations  which  the  mle's  must  satisfy  are 


V  ^  ~  ^  N-k+1 

k=l  k  k=l  N  + 


n/cp  =  E  (N-k+1  )x,  . 
k=l  K 


(9.3) 


(9.4) 


For  the  first  26  points  in  Table  8.1,  the  solutions  of  Equa¬ 
tions  (9.3)  and  (9.4)  are  N=31.2  and  cp  =  0.00685  .  In  other  words, 
the  initial  number  of  errors  is  estimated  to  be  31.2  and  the  failure 
rate  is  estimated  to  be  0.00685  errors  per  error  day.  There¬ 
fore,  the  estimated  number  of  errors  remaining  at  the  end  of  the 
production  phase  (i.e.,  at  t=  250  )  is  N-  26  -5.2. 

The  estimates  of  a  and  b  at  t  =  250  ,  as  obtained  in 
Section  &,  were  a  =  33. 99  and  b=  0.00579. 

It  can  be  easily  shown  that  for  the  De-Eutrophication  process 
the  expected  number  of  errors  detected  by  time  t  is  given  by 


MM(t)  =  N(l-  e_Cpt)  . 
N 


(9.5) 


Substituting  for  N  and  cp  , 


r.  t/i  -0.00685t. 

NL. (t,  =  31 . 2  ( 1  —  e  '  ) 


(9.6) 


For  the  N(t)  process  (NHPP  model),  the  expected  number  of  errors 
by  time  t,  as  given  in  Equation  (8.3),  is 


E  (N  (t)  ]  -  m(t)  =  33.99(1  - e"°-00579t)  . 


(9.7) 


v.  =  4*  ^ 


Equations  (9.6)  and  (9.7)  represent  the  same  physical  quantity, 
plots  of  Mjjft)  and  m(t)  are  shown  in  Figure  9.1.  The  actual 
number  of  errors  detected  by  time  t  is  also  shown.  A  comparison 
of  the  plots  shows  that  results  for  the  NIIPP  and  De-Eutrophication 
processes  are  quite  close. 

The  mean  time  to  the  kth  failure  (after  the  (k-l)st  failure) 
is  the  reciprocal  of  the  parameter  (N-k+l)cp  in  Equation  (9.1),  i.e., 

E[Xk3  =  (N-k+l)cp  '  k  =  1, 2,  . .  .  ,  (9.8) 

or 

E[Xk}  =  (31. 2-k+l jo. 00685  '  (9*9) 

The  values  for  k=l,2,...,31  were  computed  and  are  shown  in 
Table  9.1.  as  was  pointed  out  in  Section  4.2,  the  MTTF  for  the 
N(t)  process  does  not  exist  due  to  the  fact  that  the  distribution 
of  is  improper.  For  the  sake  of  comparison,  however,  we  use 

the  inverse  transformation  of  the  mean  value  function  to  get  the 
estimate  of  time  to  kth  failure  as  follows. 

sR  =  m-1(k) 

=  - y in(l-k/a) 
b 

=  ~  0.00579  3  •  99 )  ,  k  =  1,  2 , .  .  .  . 

Hence,  we  get 


(9.10) 
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TAB LE  9.1 


Comparison  of  Results  Based  on  the  NHPP  and  the 
De-Eutrophication  Models 
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"N 


The  values  of  x^  for  k=l,2,...,31  were  computed  from  Equation 

(9.10)  and  are  given  in  Table  9.1.  As  a  criterion  for  comparing 

the  results  from  the  two  models,  we  choose  the  sums  of  squares  of 

the  differences  between  the  actual  values  x.  and  the  estimated 

k 

values  x^  for  k=l,2,...,26  and  for  k  =  27 , 28 ,  . .  .  ,  31  ,  i.e.,  we 
use 


and 


26  .  2 
Fit  =  £  (x  -x  )  , 

k=l  K  k 


.31 

Prediction  =  £  (x, -x,  ) 

k=27  K  K 


We  get  Fit  =7169  and  Prediction  = 8220  for  the  De- Eutrophication 
process.  For  the  N(t)  process  we  get  Fit  =7180  and 
Prediction  =  13034.  For  this  criterion,  the  De-Eutrophication  pro¬ 
cess  gives  better  results  than  the  N(t)  process.  However,  NHPP 
gives  better  results  when  the  criterion  is  the  number  of  errors 
remaining  at  t = 250  days.  These  results  are  summarized  in 
Table  9.1. 


Next  we  compare  the  accuracy  of  estimates  (N,cp)  with  that 
of  (a,b)  by  obtaining  joint  confidence  regions  for  (N,ep)  and  (a,b). 

The  joint  100(l-c*)%  confidence  regions  for  (a,b)  are  given 
by  Equations  (4.26)  and  (4.27)  and  were  shown  in  Figure  8.4  for 
a  =  .05,  .25  and  .50. 

To  obtain  the  100(l-a)%  joint  confidence  regions  for  N 
and  cp  ,  we  use  the  same  result  that  was  used  to  get  the  confidence 
regions  for  (a,b),  viz. 
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(9.11) 


PARAMETER  N 


CONFIDENCE  REGIONS 
(NTDS  n=  26) 


PARAMETER  <t> 


Figure  3.2  Joint  confidence  regions  for  N 
and  cp  (NTDS  Data) 
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Vi; 


For  the  KHPP,  the  reliability  function  from  Equation  (3.19)  is 

A 

V 


V  .  IS 

n+1  n 


- ,  -bs  -b (s+x) , 
.  ,  .  -ale  - e  '  '  J 

(x  s)  =  e  *• 


Since  after  n  =  26  ,  a  -  33. 99  ,  b  ~  0. 00579  ,  we  get 

f  -(0.00579)  (250)  - (0. 00579 ) (250+x) , 

Rv  (x 1 250)  =  e  J-:S*yyie  "  e  i 

X27  lb26 


or 


,,,  QQ  -  (.00579)250  ..  -.00579x, 

Rv  ..  (XI250)  .e-33-"'a  t1-*  > 

x27  is26 


or 

Ry  (x | 250)  =  e“7.993(1_e-0.00579x)  .  (9.17) 

X27  lb26 

Plots  of  R~^(x)  and  Rv  (xl250)  for  various  values 

z/  x27lb26 

of  x  are  shown  in  Figure  9.3.  Also  shown  are  the  reliability 

functions  R-,-,  (x)  and  R  (x|540)  computed  from  the  data  for 

X32l531 

the  first  31  failures  given  in  Table  9.1.  The  reliability  after 
n=31  is  monotonically  higher  than  that  after  n  =  26  .  Also,  the 
predictions  from  NHPP  are  somewhat  more  conservative  than  those 
from  the  De-Eutrophication  process.  This  is  what  would  be  expected 
because  of  the  larger  and  more  accurate  estimates  of  the  number  of 
errors  remaining  in  the  system  when  the'  NIJPP  model  is  employed. 
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Fiyure  9.3  Plots  of  Reliability  Functions  Based  on  NHPP 
and  De-Eutrophication  Moaels. 


10.  CONCLUSIONS 


In  this  report  we  proposed  a  simple  but  very  versatile  model 
for  analyzing  failures  in  large  scale  software  systems.  The 
model  was  justified  on  the  basis  of  reasonable  and  realistic 
assumptions  about  the  nature  of  the  failure  phenomenon.  Specific¬ 
ally,  the  model  (Section  2)  is  based  on  a  non-homogeneous  Poisson 
process  (NHPP)  with  a  mean  value  function  m(t)  =  a(l  -  e  ^>t)  . 

The  choice  of  the  form  of  this  mean  value  function  was  also 
justified. 

The  parameters  of  this  model  are  a  and  b,  where  a  is  the 
expected  number  of  failures  that  will  be  encountered  if  the 
system  were  to  be  used  for  a  long  time  and  b  is  the  error  detec¬ 
tion  rate  per  error. 

Several  useful  quantitative  measures  were  proposed  (Section 
3)  for  assessing  software  performance.  These  measures  are  the 
number  of  failures  by  time  t,  number  of  errors  remaining  in  the 
system  at  t,  software  reliability,  etc.  Models  were  also  de¬ 
veloped  for  computing  these  measures  from  actual  failure  data. 

A  methodology  for  obtaining  the  maximum  likelihood  estimates 
of  a  and  b  was  presented  (Section  4)  for  the  cases  when  the  data 
is  given  as  failure  counts  or  as  times  between  failures.  A 
goodness-of-f it  test  based  on  the  Kolmogorov-Smirnov  statistic 
was  developed  (Section  5)  to  check  the  adequacy  cf  the  fitted 
model . 
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Failure  data  from  two  DOD  systems  were  analyzed  (Sections 
7  and  8)  using  the  methodology  presented  here  (Section  6) .  Re¬ 
sults  of  the  analyses  and  a  limited  comparative  study  (Section 
9)  indicate  that  the  NHPP  model  seems  to  do  quite  well  in  ex¬ 
plaining  the  failure  occurrence  phenomenon.  Applications  of 
this  model  to  several  other  data  sets,  not  reported  here,  also 
yielded  satisfactory  results. 

The  model  developed  in  this  report  is  applicable  after  the 
system  error  occurrence  rate  begins  to  decline.  At  present,  all 
available  models  share  this  restriction.  Efforts  are  under  way 
to  develop  a  3-parameter  NHPP  model  which  will  be  applicable 
during  the  integration  phase.  Also,  the  parameters  cannot  be 
estimated  without  available  data.  Work  is  continuing  on  the 
development  of  a  Bayesian  methodology  which  will  permit  deter¬ 
mination  of  a  and  b  when  limited  data  is  available. 
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APPENDIX  A 


DESCRIPTION  OP  SOFTWARE  FAILURE  MDDEIS 

A. 1  Stochastic  Models  for  Times  Between  Software  Failures 

One  of  the  earliest  studies  to  develop  a  model  for  software 
reliability  was  undertaken  by  Jelinski  and  Moranda  (Reference  31). 
They  developed  a  model  for  the  time  between  software  failures, 
making  the  assumption  of  a  uniform  failure  rate.  in  other  words, 
the  software  error  detection  rate  at  any  time  is  assumed  to  be 
proportional  to  the  current  error  content  (the  number  of  remaining 
errors)  of  the  tested  program.  It  is  also  assumed  that  one  error 
is  removed/eliminated  whenever  a  software  failure  occurs.  If  N 
is  the  initial  number  of  errors  in  the  system,  the  number  of  errors 
remaining  after  (i-1)  errors  are  removed  will  be  (N-(i-l)}.  If  cp 
is  the  proportionality  constant,  the  hazard  rate  or  the  error 
detection  rate  between  the  (i-l)st  and  the  ith  failures  is 

z(x.^)  =  cp  [N-  (i-1)  ]  . 

Then  the  Probability  Density  Function  (pdf)  of  X^  ,  the  time 
between  the  (i-l)st  and  the  ith  failures,  is 

-cp[N-  (i-l)Jx. 

f(x±)  =  cp  [N-  (i-1)  ]  •  e  1  . 

This  constitutes  the  basic  model  of  the  so-called  De- Eutrophication 
process.  Statistical  inference  about  the  unknown  parameters,  N 
and  <p  ,  was  discussed  by  Lipow  (Reference  35)  who  obtained  the 
maximum  likelihood  estimates  and  the  variance-covariance  matrix 
for  N  and  cp .  Forman  and  Singpurwalla  (Reference  21)  also  proposed 
a  method  based  on  solving  the  difference  equations  in  N  and  cp  . 


Morunda  (Reference  45)  modified  the  De- Eutrophication  process 
by  assuming  the  failure  rate  to  decrease  geometrically  rather  than 
decreasing  in  constant  steps.  He  further  incorporated  the  class 
of  non-fatal  errors,  while  the  failure  rates  of  successive  errors 
form  a  geometric  progression  whose  initial  term  is  D  and  whose 
ratio  is  k .  This  is  a  superposition  of  a  geometric  De-Eutrophication 
process  and  a  Poisson  process  with  parameter  0  .  The  model  can 
now  describe  the,  burn-in  phase  by  a  De-Eutrophication  process  as 
well  as  the  steady  state  by  a  Poisson  process.  For  a  combination 
Geometric  De-Eutrophication  and  Poisson  Model,  the  failure  rate 
between  the  (i-l)st  and  the  ith  failures  is  given  by 

z(xi)  =  k1-1 D  +  0  . 

Schick  and  Wolverton  (Reference  58)  developed  a  model  whose 
hazard  rate  depends  on  the  testing  time  as  well  as  the  number  of 
remaining  errors.  They  assumed  that  the  hazard  rate  is  a  linear 
function  of  testing  time,  i.e., 

z(xi)  =  «p[N-(i-l)]xi 

where  N  and  cp  represent  the  initial  error  content  and  a  pro¬ 
portionality  constant,  respectively.  It  turns  out  that  the  distri¬ 
bution  of  the  time  between  the  (i-l)st  and  the  ith  failures  is  a 
Rayleigh  distribution  with  parameter  cp  [N- ( i-1) ) /2 .  The  estimates 
of  N  and  cp  can  be  obtained  by  the  method  of  maximum  likelihood. 

Schick  and  Wolverton  (Reference  59)  postulated  another  model  in 
which  the  hazard  rate  is  a  parabolic  function,  instead  of  a  linear 


93 


function,  of  testing  time  .  The  hazard  function  for  this  model 
is 

2 

z(x^)  =  cp  [N-  (i-1)  ]  (-axi  +  bx^  +  c)  ,  a,b,c>0. 

This  yields  an  increasing  number  of  errors  while  a  debugging  effort 
is  in  full  force,  then  reaches  a  maximum,  and  finally  declines  as 
the  number  of  remaining  errors  is  drastically  reduced. 

A  Bayesian  approach  was  taken  by  Littlewood  and  Verrall 
(Reference  36)  to  develop  a  software  reliability  growth  model.  The 
underlying  distribution  of  thd  time  between  the  (i-l)st  and  the 
ith  failures  is  an  exponential  distribution  with  rate  X  ;  i.e., 

-Xx . 

f (xi I  X)  =  Xe  1  . 


The  failure  rate  X  is  treated  as  a  random  variable  with  a  Gamma 
distribution  with  shape  parameter  a  and  scale  parameter  i)(i)  ,  an 
increasing  function  of  i.  The  function  <|i  (i)  is  assumed  to  be 
known  although  it  may  differ  from  program  to  program.  Assuming  a 
uniform  prior  distribution  of  a  ,  one  can  construct  a  data-dependent 
pdf  of  the  time  to  next  failure.  Thus,  the  pdf  of  xr+1  for  given 
n  observations  x]/x2'***'xn  is  given  by 


f(xn+llxl'x2< 
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It  was  shown  that  the  reliability  improves  if  \[i(i)  increases  more 
rapidly  with  i  than  a  linear  function  of  i.  A  goodness-of- 
fit  test  was  also  presented  and  its  use  shown  by  choosing  ^  ( i ) 
to  be  a  polynomial  in  i. 

Most  of  the  stochastic  models  treat  the  software  system  as 
a  black-box.  Littlewood  (Reference  37)  studied  a  model  in  which 
he  incroporated  the  internal  structure  of  the  software  system.  He 
ass\imed  that  the  software  was  composed  of  several  sub-programs 
which  worked  in  continuous  time  by  Markov  switching  among  them¬ 
selves  and  that  the  failures  occurred  according  to  a  Poisson  pro¬ 
cess.  The  failures  in  the  overall  program  were  then  shown  to 
follow,  asymptotically,  a  Poisson  process  whose  failure  rate  can 
be  computed  from  the  failure  rates  of  the  individual  structural  com¬ 
ponents.  By  considering  the  distribution  of  the  cost  associated 
with  failures,  it  was  shown  by  Littlewood  (Reference  38)  that  the 
distribution  of  the  total  vector  cost  due  to  failures  of  subprograms 
during  (0,t)  is,  asymptotically,  multivariate  normal. 

A  key  assumption  made  in  most  of  these  models  is  that  the 
errors  are  removed  with  certainty  when  detected.  However,  as 
pointed  out  in  Miyamoto  (Reference  43)  and  Thayer,  et  al.  (Refer¬ 
ence  67),  in  practice  errors  are  not  always  corrected  when  detected. 
The  above  models  do  not  provide  an  explicit  solution  for  such 
situations. 

To  overcome  this  limitation,  Goel  and  Okumoto  (References 
26,  28,  30)  developed  an  imperfect  Debugging  Model  (IDM).  In 
this  model,  the  number  of  errors  in  the  system  at  time  t,  X(t), 
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is  treated  as  a  Markov  process  whose  transition  probabilities 
are  governed  by  the  probability  of  imperfect  debugging.  Times 
between  the  transitions  of  X(t)  are  taken  to  be  exponentially 
distributed  with  rates  dependent  on  the  current  error  content  of 
the  system.  Expressions  are  derived  for  performance  measures  such 
as  the  distribution  of  time  to  a  completely  debugged  system, 
distribution  of  the  number  of  remaining  errors  and  software  reli¬ 
ability.  Okumoto  and  Goel  (Reference  48)  discussed  methods  for 
obtaining  the  maximum  likelihood  estimates  and  confidence  regions 
for  the  parameters  N  (the  initial  error  content),  q  (probability 
of  imperfect  debugging),  and  X  (failure  rate  per  error). 

The  reliability  models  based  on  time  between  software  failures 
are  summarized  in  Table  A.l. 


! 
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TABLE  A 


A . 2  Stochastic  Models  Based  on  Number  of  Failures 


One  of  the  earliest  models  in  this  category  was  proposed  by 
Shooman  (Reference  61).  He  analyzed  the  actual  error  data  from  func¬ 
tional  tests  after  different  debugging  times,  and  used  the  history  of 
these  errors  to  specify  the  error  detection  rate  function,  p(t)  . 
Hence,  the  total  number  of  errors  removed  during  t  months  of 
debugging  is  taken  to  be 

e (t)  =  C  p (x)dx  . 

J0 

If  we  assume  that  the  total  number  of  errors  in  the  program,  , 
is  constant,  and  that  no  new  errors  are  added  during  debugging, 
then  c(t)-»et/it  as  t  where  IT  is  the  number  of  instructions 
in  the  program.  Then  the  number  of  errors  remaining  at  time  t 
can  be  expressed  as 

er(T)  =  ET/lT  -  € (T)  . 

Assuming  that  the  software  failures  occur  due  to  the  occasional 
traversing  of  a  portion  of  the  program  which  has  one  or  more  errors, 
the  hazard  rate  in  an  operational  phase  for  software  which  has 
been  debugged  for  r  months  must  be  proportional  to  the  number  of 
errors  remaining  at  time  t  j  i.e.  , 

z(t)  =  K  e  r  ( t ) 

-  K[Et/it-^o  p  (x ) dx] 

where  K  is  an  arbitrary  constant.  Since  z(t)  is  constant  over 
the  operational  time  t ,  the  reliability  is  simply 
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-  [K  e  (  t  )  J  t 

R(t)  —  G 

and  the  mean  time  to  failure  is 

MTTF  =  1  /  [K  e  (  t  )  ]  . 

A  similar  approach  was  taken  by  Musa  (Reference  46)  to  develop  an 
execution  time  model.  If  we  denote  the  number  of  inherent  errors 
in  the  program  by  Nq  and  the  net  number  of  errors  corrected  during 
the  execution  time  t  by  T|(t)  ,  then  the  number  of  errors  remain¬ 
ing  at  time  t  is 

M(t)  =  Nq  -  T1(t)  . 

He  assumed  that  (i)  the  errors  in  the  program  are  independent  of 
each  other  and  are  distributed  at  any  time  with  a  constant  average 
occurrence  rate  per  instruction  throughout  the  program,  (ii)  various 
types  of  instructions  are  reasonably  well  mixed,  and  (iii)  the 
execution  time  between  failures  is  large  compared  to  average  instruc¬ 
tion  execution  time.  The  hazard  rate  of  the  errors  is  then  given 
by 

z('0  =  Kf  T)  (  r ) 

=  KfNQ  -  Kf  7]  (  T ) 

where  K  is  the  error  exposure  ratio  and  f  is  the  line.&>-  execution 
frequency.  Furthermore,  assuming  that  the  error  correction  rate 
eqUal  to  the  error  exposure  rate,  z(t),  he  obtained  the 
hazard  rate 
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2  (  t )  =  KfN0  e“KfT  . 

Hence  for  execution  time  t*  ,  projected  from  t  ,  the  reliability 
is  given  by 

R(T,T.)  =  e"Z (T ) T '  _ 

This  is  the  basic  execution  time  model.  The  model  was  then  gen¬ 
eralized  by  introducing  an  error  reduction  factor#  B  ,  ana  a  testing 
compression  factor,  C .  The  relationship  between  the  execution 
time  and  calendar  time  was  also  investigated  by  incorporating  the 
limitations  on  the  availability  of  resources  (failure  identification 
personnel,  failure  correction  personnel,  and  computer  time). 

Taking  a  different  approach,  Schneidewind  (Reference  60)  studied 
the  number  of  errors  detected  during  a  time  interval  and  the  collection 
of  error  counts  over  a  series  of  time  intervals  ,  by  assuming  that 
the  failure  process  is  a  non-homogeneous  Poisson  process  with  an 
exponentially  decaying  intensity  function 

d(i)  =  are  ,  ff,  P  ^0  i  1  =  1,2,...  . 

As  an  extension  to  his  earlier  models,  Moranda  (Reference  45) 
developed  a  geometric-Poisson  model  assuming  that  the  number  »  , 

of  errors  occurring  in  the  ith  interval  is  governed  by  a  Poisson 
distribution  with  parameter  xk^""^  . 
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As  mentioned  earlier,  an  operational  software  system  is  sub¬ 


ject  to  random  failures  caused  by  software  errors  in  the  system. 

The  maintenance/debugging  activity  is  then  undertaken  whenever  a 
software  failure  occurs.  The  system  goes  through  a  maintenance 
phase  to  remove  the  cause  of  failure  and  becomes  operational  as 
soon  as  the  maintenance  activity  is  over.  Trivedi  and  Shooman 
(Reference  68)  developed  an  availability  model  by  considering  the 
sequence  of  operational  and > maintenance  (up  and  down)  phases  of  the 
software  system.  The  distribution  of  times  in  both  states  was 
taken  to  be  exponential  with  a  state  dependent  parameter. 

A  generalized  model  for  the  operational  and  maintenance  phases 
was  developed  by  Okumoto  and  Goel  (References  49,  50).  In  this 
model,  the  time  to  remove  an  error  is  assumed  to  follow  an  exponential 
distribution  with  a  rate  dependent  on  the  current  error  content  of 
the  system.  The  sequence  of  operational  and  maintenance  states 
of  the  software  system  is  formulated  as  a  semi-Markov  process  and 
expressions  are  obtained  for  system  availability  and  other  per¬ 
formance  measures.  They  also  developed  a  nomogram  to  explore  the 
trade-offs  between  the  expected  time  to  a  specified  number  of 
remaining  errors,  which  determines  the  software  operational  per¬ 
formance,  and  the  manpower  requirements  to  achieve  the  desired 
objective. 


A. 4  Combinatorial  Models 


A . 4 . 1  Capture-recapture  sampling 

Mills  (Reference  42)  formulated  the  problem  of  estimating  tne 
number  of  errors  in  a  program  by  using  a  technique  callea  capture- 
recapture  sampling.  In  this  technique,  a  program  containing  an  unknown 
number  n^.  of  indigenous  errors  is  deliberately  'modified'  by 
seeding  a  set  of  known  errors,  n  .  These  errors  could  be  then 
discovered  in  successive  tests,  each  of  which  is  considered  a  trial. 
Then,  the  joint  probability  of  finding  X^.  indigenous  errors  and 
Xg  seeded  errors  is  given  by  a  hypergeometric  distribution. 

Lipow  (Reference  34)  modified  this  problem  by  taking  into  consid¬ 
eration  the  probability,  q,  of  finding  an  error  (of  either  kind)  in 
any  test  of  the  software.  Then,  for  N  statistically  independent 
tests  the  probability  of  finding  Xj  indigenous  and  x..  seeded  errors 
is  given  by 


WXI  •  Vxs  ’  -J-W 
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The  maximum  likelihood  estimates  of  q  and  n^.  are  given  by 


q  = 


XI+Xs 

N 


and 
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if  xT  +  x  >1 
I  s  — 

if  x,  +  x  =0 
I  s 

if  xg  =  0  . 

Basin  (Reference  5)  suggested  a  somewhat  different  procedure, 
the  so-called  two-state  edit  procedure,  where  one  programmer  searches 
for  defects  and  records  n^  errors  out  of  a  total  of  N  unknown 
indigenous  errors.  A  second  programmer  edits  the  program  inde¬ 
pendently  and  finds,  say,  r  errors.  The  two  lists  of  errors  are 
then  compared.  The  probability  that  the  same  k  errors  are  found 
by  the  two  programmers  is  given  by  a  hypergeometric  distribution. 


A. 4.2  Input  data  domain  considerations. 

Software  reliability  assessment  based  on  program  structure  has 
been  proposed  by  Nelson  (Reference  47).  The  reliability  of  a  computer 
program  here  is  defined  as  the  probability  of  the  program  being 
correct  on  any  given  run.  Data  sets  are  used  to  execute  the  pro¬ 
gram  structure.  Each  input  data  set  proceeds  through  a  sequence 
of  segments,  called  a  logical  path,  with  a  branch  to  a  new  segment 
taking  place  at  the  exit  of  each  segment.  The  input  data  space,  E , 
is  partitioned  into  a  small  number  of  disjoint  subsets,  , 
j  = 1, 2, . . . ,k  ,  to  produce  the  operational  profile  probability  distri¬ 
bution,  P(Zj)  .  If  a  program  is  executed  a  total  of  n  times  and 
fj  failures  are  observed  out  of  n^  runs  using  points  from  zj  , 
then  an  estimate  of  program  operational  usage  reliability  is  given 
by 


103 


I 


f . 

Rx  =  1  -  £  ^  p(z  .)  • 
j  j  3 

Further,  assuming  that  the  test  cases  (i.e.  ,  n  executions  of  the 
program)  are  identically  proportional  to  P(z^)  ,  an  estimate  of 
the  observed  (or  assessed)  reliability  is  given  by 


The  techniques  for  developing  a  set  of  test  cases  which  serve 
to  accomplish  a  certain  objective,  e.g.  assurance  that  each  and 
every  structural  element  would  be  exercised  at  least  once  during 
the  execution  of  the  program  with  the  complete  set  of  test  cases, 
was  discussed  by  Brown  and  Lipow  (Reference  8).  They  used  this 
technique  to  show  its  applicability  on  two  fairly  small  programs. 


i 


Very  few  studies  have  been  reported  that  compare  the  per¬ 
formance  of  various  models.  A  comprehensive  study  for  this  objective 
was  undertaken  by  Sukert  (References  63,  64).  In  this  study  he 
analyzed  the  data  from  a  large  command  and  control  system  by  using 
several  software  failure  models,  as  a  result  of  this  study,  he 
pointed  out  the  limitations  and  difficulties  in  using  these  models. 

A  limited  comparison  of  the  models  from  the  quality  assurance 
viewpoint  has  been  reported  by  Sukert  and  Goel  (Reference  65).  A 
description  and  comparison  of  models  has  been  given  by  Yau  and 
MacGregor  (Reference  71). 

Schick  and  Wolverton  (Reference  59)  compared  various  models 
and  indicated  that  the  model  they  had  developed  (Reference  58), 
seems  to  perform  better  than  others.  Recently  Angus,  Schafer  and 
Sukert  (Reference  3)  and  Schafer,  et  al.  (Reference  57)  completed 
a  comparative  investigation  of  several  models  from  the  validation 
point  of  view.  They  analyzed  several  failure  data  sets  and  pointed 
out  the  difficulties  of  parametric  estimation  and  other  limitations 
of  these  models. 
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